BACKGROUND

1. Field of the Invention

The present invention relates to novel genes which encode enzymes of the
ω-hydroxylase complex in yeast Candida tropicalis strains. In particular, the invention
relates to novel genes encoding the cytochrome P450 and NADPH reductase enzymes of
the ω-hydroxylase complex in yeast Candida tropicalis, and to a method of quantitating
the expression of genes.

2. Description of the Related Art 

Aliphatic dioic acids are versatile chemical intermediates useful as raw
materials for the preparation of perfumes, polymers, adhesives and macrolide antibiotics.
While several chemical routes to the synthesis of long-chain alpha, ω-dicarboxylic acids are
available, the synthesis is not easy and most methods result in mixtures containing shorter
chain lengths. As a result, extensive purification steps are necessary. While it is known that
long-chain dioic acids can also be produced by microbial transformation of alkanes, fatty
acids or esters thereof, chemical synthesis has remained the most commercially viable
route, due to limitations with the current biological approaches.



Several strains of yeast are known to excrete alpha, ω-dicarboxylic acids as a
byproduct when cultured on alkanes or fatty acids as the carbon source. In particular, yeast
belonging to the Genus Candida, such as C. albicans, C. cloacae, C. guilliermondii, C.
intermedia, C. lipolytica, C. maltosa, C. parapsilosis and C. zeylanoides are known to
produce such dicarboxylic acids (Agr. Biol. Chem. 35: 2033-2042 (1971)). Also, various
strains of C. tropicalis are known to produce dicarboxylic acids ranging in chain lengths
from C11 through C18 (Okino et al., BM Lawrence, BD Mookherjee and BJ Willis (eds), in
Flavors and Fragrances: A World Perspective, Proceedings of the 10th International
Conference of Essential Oils, Flavors and Fragrances, Elsevier Science Publishers BV
Amsterdam (1988)), and are the basis of several patents as reviewed by Buhler and
Schindler, in Aliphatic Hydrocarbons in Biotechnology, H.J. Rehm and G. Reed (eds),
Vol. 169, Verlag Chemie, Weinheim (1984).

Studies of the biochemical processes by which yeasts metabolize alkanes
and fatty acids have revealed three types of oxidation reactions: α-oxidation of alkanes to
alcohols, ω-oxidation of fatty acids to alpha, ω-dicarboxylic acids and the degradative
β-oxidation of fatty acids to CO2 and water. The first two types of oxidations are catalyzed by
microsomal enzymes while the last type takes place in the peroxisomes. In C. tropicalis,
the first step in the ω-oxidation pathway is catalyzed by a membrane-bound enzyme
complex (ω-hydroxylase complex) including a cytochrome P450 monooxygenase and a
NADPH cytochrome reductase. This hydroxylase complex is responsible for the primary
oxidation of the terminal methyl group in alkanes and fatty acids (Gilewicz et al., Can. J.
Microbiol. 25:201 (1979)). The genes which encode the cytochrome P450 and NADPH
reductase components of the complex have previously been identified as P450ALK and
P450RED respectively, and have also been cloned and sequenced (Sanglard et al., Gene
76:121-136 (1989)). P450ALK has also been designated P450ALK1. More recently, ALK
genes have been designated by the symbol CYP and RED genes have been designated by
the symbol CPR. See, e.g., Nelson, Pharmacogenetics 6(1):1-42 (1996), which is
incorporated herein by reference. See also Ohkuma et al., DNA and Cell Biology
14:163-173 (1995), Seghezzi et al., DNA and Cell Biology, 11:767-780 (1992) and Kargel et al.,
Yeast 12:333-348 (1996), each incorporated herein by reference. For example, P450ALK
is also designated CYP52 according to the nomenclature of Nelson, supra. Fatty acids are
ultimately formed from alkanes after two additional oxidation steps, catalyzed by alcohol
oxidase (Kemp et al., Appl. Microbiol. and Biotechnol. 28: 370-374 (1988)) and aldehyde
dehydrogenase. The fatty acids can be further oxidized through the same or similar
pathway to the corresponding dicarboxylic acid. The ω-oxidation of fatty acids proceeds
via the ω-hydroxy fatty acid and its aldehyde derivative, to the corresponding dicarboxylic
acid without the requirement for CoA activation. However, both fatty acids and
dicarboxylic acids can be degraded, after activation to the corresponding acyl-CoA ester
through the β-oxidation pathway in the peroxisomes, leading to chain shortening. In
mammalian systems, both fatty acid and dicarboxylic acid products of ω-oxidation are
activated to their CoA-esters at equal rates and are substrates for both mitochondrial and
peroxisomal β-oxidation (J. Biochem., 102:225-234 (1987)). In yeast, β-oxidation takes
place solely in the peroxisomes (Agr. Biol. Chem. 49:1821-1828 (1985)).

It has recently been determined that certain eukaryotes, e.g., certain yeast,
do not adhere, in some respects, to the "universal" genetic code which provides that
particular codons (triplets of nucleic acids) code for specific amino acids. Indeed, the
genetic code is "universal" because it is virtually the same in all living organisms. Certain
Candida sp. are now known to translate the CTG codon (which, according to the
"universal" code designates leucine) as serine. See, e.g., Ueda et al., Biochimie (1994) 76,
1217-1222, where C. tropicalis, C. cylindracea, C. guilliermondii and C. lusitaniae are shown
to adhere to the "non-universal" code with respect to the CTG codon. Accordingly, nucleic
acid sequences may code for one amino acid sequence in "universal" code organisms and a
variant of that amino acid sequence in "non-universal" code organisms depending on the
number of CTG codons present in the nucleic acid coding sequence. The difference may
become evident when, in the course of genetic engineering, nucleic acid encoding a protein
is transferred from a "non-universal" code organism to a "universal" code organism or vice
versa. Obviously, there will be a different amino acid sequence depending on which
organism is used to express the protein.
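
The effect of the CTG reassignment described above can be illustrated with a minimal Python sketch. It is not part of the disclosed methods. The codon table is the standard genetic code, the variant table differs only in reading CTG as serine, and the short test sequence and the function name translate are hypothetical, chosen only for illustration.

```python
# Standard ("universal") codon table, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
UNIVERSAL = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# "Non-universal" variant: identical except that CTG is read as serine.
CTG_SERINE = dict(UNIVERSAL, CTG="S")


def translate(dna, table):
    """Translate an in-frame coding sequence, stopping at the first stop codon.

    Raises KeyError if the sequence contains characters other than A, C, G, T.
    """
    protein = []
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    for pos in range(0, usable, 3):
        residue = table[dna[pos:pos + 3].upper()]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical five-codon sequence containing two CTG codons.
example = "ATGCTGGCTCTGTAA"
in_universal_host = translate(example, UNIVERSAL)
in_ctg_serine_host = translate(example, CTG_SERINE)
```

The two translations differ at exactly the positions of the CTG codons, so the number of CTG codons in a coding sequence bounds the number of residues that change when the sequence is moved between the two kinds of host.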

The production of dicarboxylic acids by fermentation of unsaturated C14-C16
monocarboxylic acids using a strain of the species C. tropicalis is disclosed in U.S. Patent
4,474,882. The unsaturated dicarboxylic acids correspond to the starting materials in the
number and position of the double bonds. Similar processes in which other special
microorganisms are used are described in U.S. Patents 3,975,234 and 4,339,536, in British
Patent Specification 1,405,026 and in German Patent Publications 21 64 626, 28 53 847,
29 37 292, 29 51 177, and 21 40 133.

Cytochromes P450 (P450s) are terminal monooxidases of a
multicomponent enzyme system as described above. They comprise a superfamily of
proteins which exist widely in nature, having been isolated from a variety of organisms as
described, e.g., in Nelson, supra. These organisms include various mammals, fish,
invertebrates, plants, mollusks, crustaceans, lower eukaryotes and bacteria (Nelson, supra).
First discovered in rodent liver microsomes as a carbon-monoxide binding pigment as
described, e.g., in Garfinkel, Arch. Biochem. Biophys. 77:493-509 (1958), which is
incorporated herein by reference, P450s were later named based on their
absorption at 450 nm in a reduced-CO coupled difference spectrum as described, e.g., in
Omura et al., J. Biol. Chem. 239:2370-2378 (1964), which is incorporated herein by
reference.

P450s catalyze the metabolism of a variety of endogenous and exogenous
compounds (Nelson, supra). Endogenous compounds include steroids, prostanoids,
eicosanoids, fat-soluble vitamins, fatty acids, mammalian alkaloids, leukotrienes, biogenic
amines and phytoalexins (Nelson, supra). P450 metabolism involves such reactions as
epoxidation, hydroxylation, dealkylation, N-hydroxylation, sulfoxidation, desulfuration and
reductive dehalogenation. These reactions generally make the compound more water
soluble, which is conducive to excretion, and more electrophilic. These electrophilic
products can have detrimental effects if they react with DNA or other cellular constituents.
However, they can react through conjugation with low molecular weight hydrophilic
substances resulting in glucuronidation, sulfation, acetylation, amino acid conjugation or
glutathione conjugation typically leading to inactivation and elimination as described, e.g.,
in Klaassen et al., Toxicology, 3rd ed., Macmillan, New York, 1986, incorporated herein by
reference.

P450s are heme thiolate proteins consisting of a heme moiety bound to a
single polypeptide chain of 45,000 to 55,000 Da. The iron of the heme prosthetic group is
located at the center of a protoporphyrin ring. Four ligands of the heme iron can be
attributed to the porphyrin ring. The fifth ligand is a thiolate anion from a cysteinyl residue
of the polypeptide. The sixth ligand is probably a hydroxyl group from an amino acid
residue, or a moiety with a similar field strength such as a water molecule as described, e.g.,
in Goeptar et al., Critical Reviews in Toxicology 25(1):25-65 (1995), incorporated herein by
reference.

Monooxygenation reactions catalyzed by cytochromes P450 in a eukaryotic
membrane-bound system require the transfer of electrons from NADPH to P450 via
NADPH-cytochrome P450 reductase (CPR) as described, e.g., in Taniguchi et al., Arch.
Biochem. Biophys. 232:585 (1984), incorporated herein by reference. CPR genes are now
also referred to as NCP genes. See, e.g., Debacker et al., Antimicrobial Agents and
Chemotherapy 45:1660 (2001). CPR is a flavoprotein of approximately 78,000 Da
containing 1 mol of flavin adenine dinucleotide (FAD) and 1 mol of flavin mononucleotide
(FMN) per mole of enzyme as described, e.g., in Potter et al., J. Biol. Chem. 258:6906
(1983), incorporated herein by reference. The FAD moiety of CPR is the site of electron
entry into the enzyme, whereas FMN is the electron-donating site to P450 as described,
e.g., in Vermilion et al., J. Biol. Chem. 253:8812 (1978), incorporated herein by reference.
The overall reaction is as follows:

H+ + RH + NADPH + O2 → ROH + NADP+ + H2O

Binding of a substrate to the catalytic site of P450 apparently results in a
conformational change initiating electron transfer from CPR to P450. Subsequent to the
transfer of the first electron, O2 binds to the Fe2+-P450 substrate complex to form
Fe3+-P450-substrate complex. This complex is then reduced by a second electron from CPR, or, in
some cases, NADH via cytochrome b5 and NADH-cytochrome b5 reductase as described,
e.g., in Guengerich et al., Arch. Biochem. Biophys. 205:365 (1980), incorporated herein by
reference. One atom of this reactive oxygen is introduced into the substrate, while the
other is reduced to water. The oxygenated substrate then dissociates, regenerating the
oxidized form of the cytochrome P450 as described, e.g., in Klaassen, Amdur and Doull,
Casarett and Doull's Toxicology, Macmillan, New York (1986), incorporated herein by
reference.

The P450 reaction cycle can be short-circuited in such a way that O2 is
reduced to O2- and/or H2O2 instead of being utilized for substrate oxygenation. This side
reaction is often referred to as the "uncoupling" of cytochrome P450 as described, e.g., in
Kuthan et al., Eur. J. Biochem. 126:583 (1982) and Poulos et al., FASEB J. 6:674 (1992),
both of which are incorporated herein by reference. The formation of these oxygen
radicals may lead to oxidative cell damage as described, e.g., in Mukhopadhyay, J. Biol.
Chem. 269(18):13390-13397 (1994) and Ross et al., Biochem. Pharm. 49(7):979-989
(1995), both of which are incorporated herein by reference. It has been proposed that
cytochrome b5's effect on P450 binding to the CPR results in a more stable complex which
is less likely to become "uncoupled" as described, e.g., in Yamazaki et al., Arch. Biochem.
Biophys. 325(2):174-182 (1996), incorporated herein by reference.

P450 families are assigned based upon protein sequence comparisons.
Notwithstanding a certain amount of heterogeneity, a practical classification of P450s into
families can be obtained based on deduced amino acid sequence similarity. P450s with
amino acid sequence similarity of between about 40 - 80% are considered to be in the same
family, with sequences of about > 55% belonging to the same subfamily. Those with
sequence similarity of about < 40% are generally listed as members of different P450 gene
families (Nelson, supra). A value of about > 97% is taken to indicate allelic variants of the
same gene, unless proven otherwise based on catalytic activity, sequence divergence in
non-translated regions of the gene sequence, or chromosomal mapping.
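
The classification rule just described can be written as a short Python sketch. The cut-offs come from the paragraph above, where each is qualified by "about", so the exact boundary handling here (for example, treating exactly 40% as the same family) is an assumption, as is the function name classify_p450_pair.

```python
def classify_p450_pair(percent_similarity):
    """Classify two P450 sequences by deduced amino acid sequence similarity.

    Thresholds follow the approximate values stated in the text; how the
    exact boundary values are treated is an assumption of this sketch.
    """
    if not 0 <= percent_similarity <= 100:
        raise ValueError("similarity must be a percentage between 0 and 100")
    if percent_similarity > 97:
        # Presumed allelic unless catalytic activity, divergence in
        # non-translated regions, or chromosomal mapping shows otherwise.
        return "allelic variants of the same gene (presumed)"
    if percent_similarity > 55:
        return "same subfamily"
    if percent_similarity >= 40:
        return "same family"
    return "different families"
```

For instance, a hypothetical pair at 62% similarity would be reported as belonging to the same subfamily, while a pair at 35% would be placed in different families.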

The most highly conserved region is the HR2 consensus containing the
invariant cysteine residue near the carboxyl terminus which is required for heme binding as
described, e.g., in Gotoh et al., J. Biochem. 93:807-817 (1983) and Motohashi et al., J.
Biochem. 101:879-997 (1987), both of which are incorporated herein by reference.
Additional consensus regions, including the central region of helix I and the
transmembrane region, have also been identified, as described, e.g., in Goeptar et al., supra
and Kalb et al., PNAS 85:7221-7225 (1988), incorporated herein by reference, although
the HR2 cysteine is the only invariant amino acid among P450s.

Short chain (<C12) aliphatic dicarboxylic acids (diacids) are important
industrial intermediates in the manufacture of diesters and polymers, and find application
as thermoplastics, plasticizing agents, lubricants, hydraulic fluids, agricultural chemicals,
pharmaceuticals, dyes, surfactants, and adhesives. The high price and limited availability of
short chain diacids are due to constraints imposed by the existing chemical synthesis.

Long-chain diacids (aliphatic α,ω-dicarboxylic acids with carbon numbers
of 12 or greater, hereafter also referred to as diacids) (HOOC-(CH2)n-COOH) are a
versatile family of chemicals with demonstrated and potential utility in a variety of chemical
products including plastics, adhesives, and fragrances. Unfortunately, the full market
potential of diacids has not been realized because chemical processes produce only a
limited range of these materials at a relatively high price. In addition, chemical processes
for the production of diacids have a number of limitations and disadvantages. All the
chemical processes are restricted to the production of diacids of specific carbon chain
lengths. For example, the dodecanedioic acid process starts with butadiene. The resulting
product diacids are limited to multiples of four-carbon lengths and, in practice, only
dodecanedioic acid is made. The dodecanedioic process is based on nonrenewable
petrochemical feedstocks. The multireaction conversion process produces unwanted
byproducts, which result in yield losses, NOx pollution and heavy metal wastes.

Long-chain diacids offer potential advantages over shorter chain diacids, but
their high selling price and limited commercial availability prevent widespread growth in
many of these applications. Biocatalysis offers an innovative way to overcome these
limitations with a process that produces a wide range of diacid products from renewable
feedstocks. However, there is no commercially viable bioprocess to produce long chain
diacids from renewable resources.

SUMMARY OF THE INVENTION 
An isolated nucleic acid is provided which encodes a CPRA protein having
the amino acid sequence set forth in SEQ ID NO: 83 or SEQ ID NO: 117. An isolated
nucleic acid is also provided which includes a coding region defined by nucleotides
1006-3042 as set forth in SEQ ID NO: 81. An isolated protein is provided which includes an
amino acid sequence as set forth in SEQ ID NO: 83 or SEQ ID NO: 117. A vector is
provided which includes a nucleotide sequence encoding CPRA protein including an
amino acid sequence as set forth in SEQ ID NO: 83 or SEQ ID NO: 117. A host cell is
provided which is transfected or transformed with the nucleic acid encoding CPRA
protein having an amino acid sequence as set forth in SEQ ID NO: 83 or SEQ ID NO:
117. A method of producing a CPRA protein including an amino acid sequence as set
forth in SEQ ID NO: 83 or SEQ ID NO: 117 is also provided which includes a)
transforming a suitable host cell with a DNA sequence that encodes the protein having
the amino acid sequence as set forth in SEQ ID NO: 83 or SEQ ID NO: 117; and b)
culturing the cell under conditions favoring the expression of the protein.

An isolated nucleic acid is provided which encodes a CPRB protein
having the amino acid sequence set forth in SEQ ID NO: 84 or SEQ ID NO: 118. An
isolated nucleic acid is provided which includes a coding region defined by nucleotides
1033-3069 as set forth in SEQ ID NO: 82. An isolated protein is provided which
includes an amino acid sequence as set forth in SEQ ID NO: 84 or SEQ ID NO: 118. A
vector is provided which includes a nucleotide sequence encoding CPRB protein
including an amino acid sequence as set forth in SEQ ID NO: 84 or SEQ ID NO: 118. A
host cell is provided which is transfected or transformed with the nucleic acid encoding
CPRB protein having an amino acid sequence as set forth in SEQ ID NO: 84 or SEQ ID
NO: 118. A method of producing a CPRB protein including an amino acid sequence as
set forth in SEQ ID NO: 84 or SEQ ID NO: 118 is provided which includes a)
transforming a suitable host cell with a DNA sequence that encodes the protein having
the amino acid sequence as set forth in SEQ ID NO: 84 or SEQ ID NO: 118; and b)
culturing the cell under conditions favoring the expression of the protein.

An isolated nucleic acid is provided which encodes a CYP52A1A protein
having the amino acid sequence set forth in SEQ ID NO: 95 or SEQ ID NO: 110. An
isolated nucleic acid is provided which includes a coding region defined by nucleotides
1177-2748 as set forth in SEQ ID NO: 85. An isolated protein is provided which
includes an amino acid sequence as set forth in SEQ ID NO: 95 or SEQ ID NO: 110. A
vector is provided which includes a nucleotide sequence encoding CYP52A1A protein
including an amino acid sequence as set forth in SEQ ID NO: 95 or SEQ ID NO: 110. A
host cell is provided which is transfected or transformed with the nucleic acid encoding
CYP52A1A protein having an amino acid sequence as set forth in SEQ ID NO: 95 or SEQ
ID NO: 110. A method of producing a CYP52A1A protein including an amino acid
sequence as set forth in SEQ ID NO: 95 or SEQ ID NO: 110 is provided which includes
a) transforming a suitable host cell with a DNA sequence that encodes the protein having
the amino acid sequence as set forth in SEQ ID NO: 95 or SEQ ID NO: 110; and b)
culturing the cell under conditions favoring the expression of the protein.

An isolated nucleic acid encoding a CYP52A2A protein is provided which
has the amino acid sequence set forth in SEQ ID NO: 96. An isolated nucleic acid is
provided which includes a coding region defined by nucleotides 1199-2767 as set forth in
SEQ ID NO: 86. An isolated protein is provided which includes an amino acid sequence
as set forth in SEQ ID NO: 96. A vector is provided which includes a nucleotide
sequence encoding CYP52A2A protein including an amino acid sequence as set forth in
SEQ ID NO: 96. A host cell is provided which is transfected or transformed with the
nucleic acid encoding CYP52A2A protein having an amino acid sequence as set forth in
SEQ ID NO: 96. A method of producing a CYP52A2A protein including an amino acid
sequence as set forth in SEQ ID NO: 96 is provided which includes a) transforming a
suitable host cell with a DNA sequence that encodes the protein having the amino acid
sequence as set forth in SEQ ID NO: 96; and b) culturing the cell under conditions
favoring the expression of the protein.

An isolated nucleic acid encoding a CYP52A2B protein is provided which
has the amino acid sequence set forth in SEQ ID NO: 97. An isolated nucleic acid is
provided which includes a coding region defined by nucleotides 1072-2640 as set forth in
SEQ ID NO: 87. An isolated protein is provided which includes an amino acid sequence
as set forth in SEQ ID NO: 97. A vector is provided which includes a nucleotide
sequence encoding CYP52A2B protein including an amino acid sequence as set forth in
SEQ ID NO: 97. A host cell is provided which is transfected or transformed with the
nucleic acid encoding CYP52A2B protein having an amino acid sequence as set forth in
SEQ ID NO: 97. A method of producing a CYP52A2B protein including an amino acid
sequence as set forth in SEQ ID NO: 97 is provided which includes a) transforming a
suitable host cell with a DNA sequence that encodes the protein having the amino acid
sequence as set forth in SEQ ID NO: 97; and b) culturing the cell under conditions
favoring the expression of the protein.

An isolated nucleic acid encoding a CYP52A3A protein is provided
which has the amino acid sequence set forth in SEQ ID NO: 98. An isolated nucleic acid
is provided which includes a coding region defined by nucleotides 1126-2748 as set forth
in SEQ ID NO: 88. An isolated protein is provided which includes an amino acid
sequence as set forth in SEQ ID NO: 98. A vector is provided which includes a
nucleotide sequence encoding CYP52A3A protein including an amino acid sequence as
set forth in SEQ ID NO: 98. A host cell is provided which is transfected or transformed
with the nucleic acid encoding CYP52A3A protein having an amino acid sequence as set
forth in SEQ ID NO: 98. A method of producing a CYP52A3A protein including an
amino acid sequence as set forth in SEQ ID NO: 98 is provided which includes a)
transforming a suitable host cell with a DNA sequence that encodes the protein having
the amino acid sequence as set forth in SEQ ID NO: 98; and b) culturing the cell under
conditions favoring the expression of the protein.

An isolated nucleic acid encoding a CYP52A3B protein is provided
having the amino acid sequence as set forth in SEQ ID NO: 99 or SEQ ID NO: 111. An
isolated nucleic acid is provided which includes a coding region defined by nucleotides
913-2535 as set forth in SEQ ID NO: 89. An isolated protein is provided which includes
an amino acid sequence as set forth in SEQ ID NO: 99 or SEQ ID NO: 111. A vector is
provided which includes a nucleotide sequence encoding CYP52A3B protein including an
amino acid sequence as set forth in SEQ ID NO: 99 or SEQ ID NO: 111. A host cell is
provided which is transfected or transformed with the nucleic acid encoding CYP52A3B
protein having an amino acid sequence as set forth in SEQ ID NO: 99 or SEQ ID NO:
111. A method of producing a CYP52A3B protein including an amino acid sequence as
set forth in SEQ ID NO: 99 or SEQ ID NO: 111 is provided which includes a)
transforming a suitable host cell with a DNA sequence that encodes the protein having
the amino acid sequence as set forth in SEQ ID NO: 99 or SEQ ID NO: 111; and b)
culturing the cell under conditions favoring the expression of the protein.

An isolated nucleic acid encoding a CYP52A5A protein is provided
having the amino acid sequence set forth in SEQ ID NO: 100 or SEQ ID NO: 112. An
isolated nucleic acid is provided which includes a coding region defined by nucleotides
1103-2656 as set forth in SEQ ID NO: 90. An isolated protein is provided which
includes an amino acid sequence as set forth in SEQ ID NO: 100 or SEQ ID NO: 112. A
vector is provided which includes a nucleotide sequence encoding CYP52A5A protein
including an amino acid sequence as set forth in SEQ ID NO: 100 or SEQ ID NO: 112.
A host cell is provided which is transfected or transformed with the nucleic acid
encoding CYP52A5A protein having an amino acid sequence as set forth in SEQ ID NO:
100 or SEQ ID NO: 112. A method of producing a CYP52A5A protein including an
amino acid sequence as set forth in SEQ ID NO: 100 or SEQ ID NO: 112 is provided
which includes a) transforming a suitable host cell with a DNA sequence that encodes the
protein having the amino acid sequence as set forth in SEQ ID NO: 100 or SEQ ID NO:
112; and b) culturing the cell under conditions favoring the expression of the protein.

An isolated nucleic acid encoding a CYP52A5B protein is provided
having the amino acid sequence as set forth in SEQ ID NO: 101 or SEQ ID NO: 113. An
isolated nucleic acid is provided which includes a coding region defined by nucleotides
1142-2695 as set forth in SEQ ID NO: 91. An isolated protein is provided which
includes an amino acid sequence as set forth in SEQ ID NO: 101 or SEQ ID NO: 113. A
vector is provided which includes a nucleotide sequence encoding CYP52A5B protein
including the amino acid sequence as set forth in SEQ ID NO: 101 or SEQ ID NO: 113.
A host cell is provided which is transfected or transformed with the nucleic acid
encoding CYP52A5B protein having the amino acid sequence as set forth in SEQ ID NO:
101 or SEQ ID NO: 113. A method of producing a CYP52A5B protein including an
amino acid sequence as set forth in SEQ ID NO: 101 or SEQ ID NO: 113 is provided
which includes a) transforming a suitable host cell with a DNA sequence that encodes the
protein having the amino acid sequence as set forth in SEQ ID NO: 101 or SEQ ID NO:
113; and b) culturing the cell under conditions favoring the expression of the protein.

An isolated nucleic acid encoding a CYP52A8A protein is provided
having the amino acid sequence set forth in SEQ ID NO: 102 or SEQ ID NO: 114. An
isolated nucleic acid is provided which includes a coding region defined by nucleotides
464-2002 as set forth in SEQ ID NO: 92. An isolated protein is provided which includes
an amino acid sequence as set forth in SEQ ID NO: 102 or SEQ ID NO: 114. A vector is
provided which includes a nucleotide sequence encoding CYP52A8A protein including
an amino acid sequence as set forth in SEQ ID NO: 102 or SEQ ID NO: 114. A host cell
is provided which is transfected or transformed with the nucleic acid encoding
CYP52A8A protein having an amino acid sequence as set forth in SEQ ID NO: 102 or
SEQ ID NO: 114. A method of producing a CYP52A8A protein including an amino acid
sequence as set forth in SEQ ID NO: 102 or SEQ ID NO: 114 is provided which includes
a) transforming a suitable host cell with a DNA sequence that encodes the protein having
the amino acid sequence as set forth in SEQ ID NO: 102 or SEQ ID NO: 114; and b)
culturing the cell under conditions favoring the expression of the protein.

An isolated nucleic acid encoding a CYP52A8B protein is provided
having the amino acid sequence set forth in SEQ ID NO: 103 or SEQ ID NO: 115. An
isolated nucleic acid is provided which includes a coding region defined by nucleotides
1017-2555 as set forth in SEQ ID NO: 93. An isolated protein is provided which
includes an amino acid sequence as set forth in SEQ ID NO: 103 or SEQ ID NO: 115. A
vector is provided which includes a nucleotide sequence encoding CYP52A8B protein
including an amino acid sequence as set forth in SEQ ID NO: 103 or SEQ ID NO: 115.
A host cell is provided which is transfected or transformed with the nucleic acid
encoding CYP52A8B protein having an amino acid sequence as set forth in SEQ ID NO:
103 or SEQ ID NO: 115. A method of producing a CYP52A8B protein including an
amino acid sequence as set forth in SEQ ID NO: 103 or SEQ ID NO: 115 is provided
which includes a) transforming a suitable host cell with a DNA sequence that encodes the
protein having the amino acid sequence as set forth in SEQ ID NO: 103 or SEQ ID NO:
115; and b) culturing the cell under conditions favoring the expression of the protein.

An isolated nucleic acid encoding a CYP52D4A protein is provided
having the amino acid sequence set forth in SEQ ID NO: 104 or SEQ ID NO: 116. An
isolated nucleic acid is provided including a coding region defined by nucleotides
767-2266 as set forth in SEQ ID NO: 94. An isolated protein is provided which includes an
amino acid sequence as set forth in SEQ ID NO: 104 or SEQ ID NO: 116. A vector is
provided which includes a nucleotide sequence encoding CYP52D4A protein including an
amino acid sequence as set forth in SEQ ID NO: 104 or SEQ ID NO: 116. A host cell is
provided which is transfected or transformed with the nucleic acid encoding CYP52D4A
protein having an amino acid sequence as set forth in SEQ ID NO: 104 or SEQ ID NO:
116. A method of producing a CYP52D4A protein including an amino acid sequence as
set forth in SEQ ID NO: 104 or SEQ ID NO: 116 is provided which includes a)
transforming a suitable host cell with a DNA sequence that encodes the protein having
the amino acid sequence as set forth in SEQ ID NO: 104 or SEQ ID NO: 116; and b)
culturing the cell under conditions favoring the expression of the protein.
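
As a consistency check on the coding regions recited in the preceding paragraphs, the following Python sketch computes the length of each region and confirms that it is a whole number of codons. It assumes, without confirmation from the text, that the nucleotide coordinates are 1-based and inclusive. The dictionary and function names are hypothetical.

```python
# Gene -> (SEQ ID NO of the nucleotide sequence, first nucleotide, last nucleotide),
# taken from the coding regions recited above.
CODING_REGIONS = {
    "CPRA": (81, 1006, 3042),
    "CPRB": (82, 1033, 3069),
    "CYP52A1A": (85, 1177, 2748),
    "CYP52A2A": (86, 1199, 2767),
    "CYP52A2B": (87, 1072, 2640),
    "CYP52A3A": (88, 1126, 2748),
    "CYP52A3B": (89, 913, 2535),
    "CYP52A5A": (90, 1103, 2656),
    "CYP52A5B": (91, 1142, 2695),
    "CYP52A8A": (92, 464, 2002),
    "CYP52A8B": (93, 1017, 2555),
    "CYP52D4A": (94, 767, 2266),
}


def region_length(start, end):
    """Length in nucleotides, assuming 1-based inclusive coordinates."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def codon_count(start, end):
    """Number of whole codons in the region; rejects lengths not divisible by 3."""
    length = region_length(start, end)
    if length % 3 != 0:
        raise ValueError("region length is not a multiple of three")
    return length // 3


# Gene -> number of codons spanned by the recited coding region.
CODONS = {
    gene: codon_count(start, end)
    for gene, (_seq_id, start, end) in CODING_REGIONS.items()
}
```

Under the stated assumption every recited region spans a whole number of codons, and each pair of A and B alleles (for example CPRA and CPRB) spans the same number of codons.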

A method for discriminating members of a gene family by quantifying the
amount of target mRNA in a sample is provided which includes a) providing an
organism containing a target gene; b) culturing the organism with an organic substrate
which causes upregulation in the activity of the target gene; c) obtaining a sample of total
RNA from the organism at a first point in time; d) combining at least a portion of the
sample of the total RNA with a known amount of competitor RNA to form an RNA
mixture, wherein the competitor RNA is substantially similar to the target mRNA but has
a lesser number of nucleotides compared to the target mRNA; e) adding reverse
transcriptase to the RNA mixture in a quantity sufficient to form corresponding target
DNA and competitor DNA; f) conducting a polymerase chain reaction in the presence of
at least one primer specific for at least one substantially non-homologous region of the
target DNA within the gene family, the primer also specific for the competitor DNA; g)
repeating steps (c-f) using increasing amounts of the competitor RNA while maintaining
a substantially constant amount of target RNA; h) determining the point at which the
amount of target DNA is substantially equal to the amount of competitor DNA; i)
quantifying the results by comparing the ratio of the concentration of unknown target to
the known concentration of competitor; and j) obtaining a sample of total RNA from the
organism at another point in time and repeating steps (d-i).
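
Steps (g) through (i) above amount to locating the competitor amount at which target and competitor products are equal. One common way to do this, offered here only as a sketch and not as the method of the invention, is to fit a straight line to log10(target/competitor product ratio) against log10(competitor amount) and solve for the point where the ratio is 1. The least-squares fit, the function name equivalence_point, and the titration values are all assumptions made for illustration.

```python
import math


def equivalence_point(competitor_amounts, product_ratios):
    """Estimate the competitor amount at which target product equals competitor product.

    competitor_amounts: known amounts of competitor RNA added to each reaction.
    product_ratios: measured target DNA / competitor DNA ratio for each reaction.
    Fits log10(ratio) = slope * log10(amount) + intercept by least squares
    and returns the amount at which the fitted ratio equals 1.
    """
    if len(competitor_amounts) != len(product_ratios) or len(competitor_amounts) < 2:
        raise ValueError("need at least two paired measurements")
    xs = [math.log10(a) for a in competitor_amounts]
    ys = [math.log10(r) for r in product_ratios]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("competitor amounts must not all be equal")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    if slope == 0:
        raise ValueError("ratio does not change with competitor amount")
    intercept = mean_y - slope * mean_x
    # log10(ratio) = 0 at the equivalence point.
    return 10 ** (-intercept / slope)


# Hypothetical, idealized titration: a fixed target amount of 5.0 units,
# with the product ratio exactly proportional to the input ratio.
amounts = [1.0, 2.5, 5.0, 10.0, 25.0]
ratios = [5.0 / a for a in amounts]
estimated_target = equivalence_point(amounts, ratios)
```

With this idealized data the estimate recovers the assumed target amount of about 5.0 units. Repeating the calculation on RNA sampled at a later time point, as in step (j), would give a second estimate that can be compared with the first to follow induction of the target gene.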

A method for increasing production of a dicarboxylic acid is provided
which includes a) providing a host cell having a naturally occurring number of CPRA
genes; b) increasing, in the host cell, the number of CPRA genes which encode a CPRA
protein having the amino acid sequence as set forth in SEQ ID NO: 83 or SEQ ID NO:
117; c) culturing the host cell in media containing an organic substrate which upregulates
the CPRA gene, to effect increased production of dicarboxylic acid.

A method for increasing the production of a CPRA protein having an
amino acid sequence as set forth in SEQ ID NO: 83 or SEQ ID NO: 117 is provided
which includes a) transforming a host cell having a naturally occurring amount of CPRA
protein with an increased copy number of a CPRA gene that encodes the CPRA protein
having the amino acid sequence as set forth in SEQ ID NO: 83 or SEQ ID NO: 117; and
b) culturing the cell and thereby increasing expression of the protein compared with that
of a host cell containing a naturally occurring copy number of the CPRA gene.

A method for increasing production of a dicarboxylic acid is provided 
which includes a) providing a host cell having a naturally occurring number of CPRB 
genes; b) increasing, in the host cell, the number of CPRB genes which encode a CPRB 
protein having the amino acid sequence as set forth in SEQ ID NO: 84 or SEQ ID NO: 
118; c) culturing the host cell in media containing an organic substrate which upregulates 
the CPRB gene, to effect increased production of dicarboxylic acid. 

A method for increasing the production of a CPRB protein having an 
amino acid sequence as set forth in SEQ ID NO: 84 or SEQ ID NO: 118 is provided 
which includes a) transforming a host cell having a naturally occurring amount of CPRB 
protein with an increased copy number of a CPRB gene that encodes the CPRB protein 
having the amino acid sequence as set forth in SEQ ID NO: 84 or SEQ ID NO: 118; and 
b) culturing the cell and thereby increasing expression of the protein compared with that 
of a host cell containing a naturally occurring copy number of the CPRB gene. 

A method for increasing production of a dicarboxylic acid is provided 
which includes a) providing a host cell having a naturally occurring number of 
CYP52A1A genes; b) increasing, in the host cell, the number of CYP52A1A genes which 
encode a CYP52A1A protein having the amino acid sequence as set forth in SEQ ID NO: 
95 or SEQ ID NO: 110; c) culturing the host cell in media containing an organic substrate 
which upregulates the CYP52A1A gene, to effect increased production of dicarboxylic 
acid. 

A method for increasing the production of a CYP52A1A protein having an 
amino acid sequence as set forth in SEQ ID NO: 95 or SEQ ID NO: 110 is provided 
which includes a) transforming a host cell having a naturally occurring amount of 
CYP52A1A protein with an increased copy number of a CYP52A1A gene that encodes the 
CYP52A1A protein having the amino acid sequence as set forth in SEQ ID NO: 95 or 
SEQ ID NO: 110; and b) culturing the cell and thereby increasing expression of the 
protein compared with that of a host cell containing a naturally occurring copy number of 
the CYP52A1A gene. 

A method for increasing production of a dicarboxylic acid is provided 
which includes a) providing a host cell having a naturally occurring number of 
CYP52A2A genes; b) increasing, in the host cell, the number of CYP52A2A genes which 
encode a CYP52A2A protein having the amino acid sequence as set forth in SEQ ID NO: 
96; c) culturing the host cell in media containing an organic substrate which upregulates 
the CYP52A2A gene, to effect increased production of dicarboxylic acid. 

A method for increasing the production of a CYP52A2A protein having an 
amino acid sequence as set forth in SEQ ID NO: 96 is provided which includes a) 
transforming a host cell having a naturally occurring amount of CYP52A2A protein with 
an increased copy number of a CYP52A2A gene that encodes the CYP52A2A protein 
having the amino acid sequence as set forth in SEQ ID NO: 96; and b) culturing the cell 
and thereby increasing expression of the protein compared with that of a host cell 
containing a naturally occurring copy number of the CYP52A2A gene. 

A method for increasing production of a dicarboxylic acid is provided 
which includes a) providing a host cell having a naturally occurring number of 
CYP52A2B genes; b) increasing, in the host cell, the number of CYP52A2B genes which 
encode a CYP52A2B protein having the amino acid sequence as set forth in SEQ ID NO: 
97; c) culturing the host cell in media containing an organic substrate which upregulates 
the CYP52A2B gene, to effect increased production of dicarboxylic acid. 

A method for increasing the production of a CYP52A2B protein having an 
amino acid sequence as set forth in SEQ ID NO: 97 is provided which includes a) 
transforming a host cell having a naturally occurring amount of CYP52A2B protein with 
an increased copy number of a CYP52A2B gene that encodes the CYP52A2B protein 
having the amino acid sequence as set forth in SEQ ID NO: 97; and b) culturing the cell 
and thereby increasing expression of the protein compared with that of a host cell 
containing a naturally occurring copy number of the CYP52A2B gene. 

A method for increasing production of a dicarboxylic acid is provided 
which includes a) providing a host cell having a naturally occurring number of 
CYP52A3A genes; b) increasing, in the host cell, the number of CYP52A3A genes which 
encode a CYP52A3A protein having the amino acid sequence as set forth in SEQ ID NO: 
98; c) culturing the host cell in media containing an organic substrate which upregulates 
the CYP52A3A gene, to effect increased production of dicarboxylic acid. 

A method for increasing the production of a CYP52A3A protein having an 
amino acid sequence as set forth in SEQ ID NO: 98 is provided which includes a) 
transforming a host cell having a naturally occurring amount of CYP52A3A protein with 
an increased copy number of a CYP52A3A gene that encodes the CYP52A3A protein 
having the amino acid sequence as set forth in SEQ ID NO: 98; and b) culturing the cell 
and thereby increasing expression of the protein compared with that of a host cell 
containing a naturally occurring copy number of the CYP52A3A gene. 

A method for increasing production of a dicarboxylic acid is provided 
which includes a) providing a host cell having a naturally occurring number of 
CYP52A3B genes; b) increasing, in the host cell, the number of CYP52A3B genes which 
encode a CYP52A3B protein having the amino acid sequence as set forth in SEQ ID NO: 
99 or SEQ ID NO: 111; c) culturing the host cell in media containing an organic substrate 
which upregulates the CYP52A3B gene, to effect increased production of dicarboxylic 
acid. 

A method for increasing the production of a CYP52A3B protein having an 
amino acid sequence as set forth in SEQ ID NO: 99 or SEQ ID NO: 111 is provided 
which includes a) transforming a host cell having a naturally occurring amount of 
CYP52A3B protein with an increased copy number of a CYP52A3B gene that encodes the 
CYP52A3B protein having the amino acid sequence as set forth in SEQ ID NO: 99 or 
SEQ ID NO: 111; and b) culturing the cell and thereby increasing expression of the 
protein compared with that of a host cell containing a naturally occurring copy number of 
the CYP52A3B gene. 

A method for increasing production of a dicarboxylic acid is provided 
which includes a) providing a host cell having a naturally occurring number of 
CYP52A5A genes; b) increasing, in the host cell, the number of CYP52A5A genes which 
encode a CYP52A5A protein having the amino acid sequence as set forth in SEQ ID NO: 
100 or SEQ ID NO: 112; c) culturing the host cell in media containing an organic 
substrate which upregulates the CYP52A5A gene, to effect increased production of 
dicarboxylic acid. 

A method for increasing the production of a CYP52A5A protein having an 
amino acid sequence as set forth in SEQ ID NO: 100 or SEQ ID NO: 112 is provided 
which includes a) transforming a host cell having a naturally occurring amount of 
CYP52A5A protein with an increased copy number of a CYP52A5A gene that encodes the 
CYP52A5A protein having the amino acid sequence as set forth in SEQ ID NO: 100 or 
SEQ ID NO: 112; and b) culturing the cell and thereby increasing expression of the 
protein compared with that of a host cell containing a naturally occurring copy number of 
the CYP52A5A gene. 

A method for increasing production of a dicarboxylic acid is provided 
which includes a) providing a host cell having a naturally occurring number of 
CYP52A5B genes; b) increasing, in the host cell, the number of CYP52A5B genes which 
encode a CYP52A5B protein having the amino acid sequence as set forth in SEQ ID NO: 
101 or SEQ ID NO: 113; c) culturing the host cell in media containing an organic 
substrate which upregulates the CYP52A5B gene, to effect increased production of 
dicarboxylic acid. 

A method for increasing the production of a CYP52A5B protein having an 
amino acid sequence as set forth in SEQ ID NO: 101 or SEQ ID NO: 113 is provided 
which includes a) transforming a host cell having a naturally occurring amount of 
CYP52A5B protein with an increased copy number of a CYP52A5B gene that encodes the 
CYP52A5B protein having the amino acid sequence as set forth in SEQ ID NO: 101 or 
SEQ ID NO: 113; and b) culturing the cell and thereby increasing expression of the 
protein compared with that of a host cell containing a naturally occurring copy number of 
the CYP52A5B gene. 

A method for increasing production of a dicarboxylic acid is provided 
which includes a) providing a host cell having a naturally occurring number of 
CYP52A8A genes; b) increasing, in the host cell, the number of CYP52A8A genes which 
encode a CYP52A8A protein having the amino acid sequence as set forth in SEQ ID NO: 
102 or SEQ ID NO: 114; c) culturing the host cell in media containing an organic 
substrate which upregulates the CYP52A8A gene, to effect increased production of 
dicarboxylic acid. 

A method for increasing the production of a CYP52A8A protein having an 
amino acid sequence as set forth in SEQ ID NO: 102 or SEQ ID NO: 114 is provided 
which includes a) transforming a host cell having a naturally occurring amount of 
CYP52A8A protein with an increased copy number of a CYP52A8A gene that encodes the 
CYP52A8A protein having the amino acid sequence as set forth in SEQ ID NO: 102 or 
SEQ ID NO: 114; and b) culturing the cell and thereby increasing expression of the 
protein compared with that of a host cell containing a naturally occurring copy number of 
the CYP52A8A gene. 

A method for increasing production of a dicarboxylic acid is provided 
which includes a) providing a host cell having a naturally occurring number of 
CYP52A8B genes; b) increasing, in the host cell, the number of CYP52A8B genes which 
encode a CYP52A8B protein having the amino acid sequence as set forth in SEQ ID NO: 
103 or SEQ ID NO: 115; c) culturing the host cell in media containing an organic 
substrate which upregulates the CYP52A8B gene, to effect increased production of 
dicarboxylic acid. 

A method for increasing the production of a CYP52A8B protein having an 
amino acid sequence as set forth in SEQ ID NO: 103 or SEQ ID NO: 115 is provided 
which includes a) transforming a host cell having a naturally occurring amount of 
CYP52A8B protein with an increased copy number of a CYP52A8B gene that encodes the 
CYP52A8B protein having the amino acid sequence as set forth in SEQ ID NO: 103 or 
SEQ ID NO: 115; and b) culturing the cell and thereby increasing expression of the 
protein compared with that of a host cell containing a naturally occurring copy number of 
the CYP52A8B gene. 

A method for increasing production of a dicarboxylic acid is provided 
which includes a) providing a host cell having a naturally occurring number of 
CYP52D4A genes; b) increasing, in the host cell, the number of CYP52D4A genes which 
encode a CYP52D4A protein having the amino acid sequence as set forth in SEQ ID NO: 
104 or SEQ ID NO: 116; c) culturing the host cell in media containing an organic 
substrate which upregulates the CYP52D4A gene, to effect increased production of 
dicarboxylic acid. 

A method for increasing the production of a CYP52D4A protein having an 
amino acid sequence as set forth in SEQ ID NO: 104 or SEQ ID NO: 116 is provided 
which includes a) transforming a host cell having a naturally occurring amount of 
CYP52D4A protein with an increased copy number of a CYP52D4A gene that encodes the 
CYP52D4A protein having the amino acid sequence as set forth in SEQ ID NO: 104 or 
SEQ ID NO: 116; and b) culturing the cell and thereby increasing expression of the 
protein compared with that of a host cell containing a naturally occurring copy number of 
the CYP52D4A gene. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 is a schematic representation of cloning vector pTriplEx from 
Clontech Laboratories, Inc. Selected restriction sites within the multiple cloning site are 
shown. 

Figure 2A is a map of the ZAP Express™ vector. 

Figure 2B is a schematic representation of cloning phagemid vector pBK-CMV. 

Figure 3 is a double stranded DNA sequence of a portion of the 5 prime 
coding region of the CYP52A5A gene (SEQ ID NO: 36), the non-coding or antisense 
sequence (SEQ ID NO: 108), primer 7581-97F (SEQ ID NO: 47) and primer 7581-97M 
(SEQ ID NO: 48). 

Figure 4 is a diagrammatic representation of highly conserved regions of 
CYP and CPR gene protein sequences. Helix I represents the putative substrate binding 
site and HR2 represents the heme binding region. The FMN, FAD and NADPH binding 
regions are indicated below the CPR gene. 

Figure 5 is a diagrammatic representation of the plasmid pHKM1 
containing the truncated CPRA gene present in the pTriplEx vector. A detailed restriction 
map of only the sequenced region is shown at the top. The bar indicates the open reading 
frame. The direction of transcription is indicated by an arrow under the open reading 
frame. 

Figure 6 is a diagrammatic representation of the plasmid pHKM4 
containing the truncated CPRA gene present in the pTriplEx vector. A detailed restriction 
map of only the sequenced region is shown at the top. The bar indicates the open reading 
frame. The direction of transcription is indicated by an arrow under the open reading 
frame. 

Figure 7 is a diagrammatic representation of the plasmid pHKM9 
containing the CPRB gene (SEQ ID NO: 82) present in the pBK-CMV vector. A detailed 
restriction map of only the sequenced region is shown at the top. The bar indicates the 
open reading frame. The direction of transcription is indicated by an arrow under the 
open reading frame. 

Figure 8 is a diagrammatic representation of the plasmid pHKM11 
containing the CYP52A1A gene (SEQ ID NO: 85) present in the pBK-CMV vector. A 
detailed restriction map of only the sequenced region is shown at the top. The bar 
indicates the open reading frame. The direction of transcription is indicated by an arrow 
under the open reading frame. 

Figure 9 is a diagrammatic representation of the plasmid pHKM12 
containing the CYP52A8A gene (SEQ ID NO: 92) present in the pBK-CMV vector. A 
detailed restriction map of only the sequenced region is shown at the top. The bar 
indicates the open reading frame. The direction of transcription is indicated by an arrow 
under the open reading frame. 

Figure 10 is a diagrammatic representation of the plasmid pHKM13 
containing the CYP52D4A gene (SEQ ID NO: 94) present in the pBK-CMV vector. A 
detailed restriction map of only the sequenced region is shown at the top. The bar 
indicates the open reading frame. The direction of transcription is indicated by an arrow 
under the open reading frame. 

Figure 11 is a diagrammatic representation of the plasmid pHKM14 
containing the CYP52A2B gene (SEQ ID NO: 87) present in the pBK-CMV vector. A 
detailed restriction map of only the sequenced region is shown at the top. The bar 
indicates the open reading frame. The direction of transcription is indicated by an arrow 
under the open reading frame. 

Figure 12 is a diagrammatic representation of the plasmid pHKM15 
containing the CYP52A8B gene (SEQ ID NO: 93) present in the pBK-CMV vector. A 
detailed restriction map of only the sequenced region is shown at the top. The bar 
indicates the open reading frame. The direction of transcription is indicated by an arrow 
under the open reading frame. 

Figures 13A-13D show the complete DNA sequences including regulatory 
and coding regions for the CPRA gene (SEQ ID NO: 81) and CPRB gene (SEQ ID NO: 
82) from C. tropicalis ATCC 20336. Figures 13A-13D show regulatory and coding region 
alignment of these sequences. Asterisks indicate conserved nucleotides. The start codons 
are underlined and the last amino acid coding codons immediately before the stop codon 
are underlined. 

Figure 14 shows the amino acid sequence of the CPRA (SEQ ID NO: 83) 
and CPRB (SEQ ID NO: 84) proteins from C. tropicalis ATCC 20336 and alignment of 
these amino acid sequences. Asterisks indicate residues which are not conserved. 

Figures 15A-15M show the complete DNA sequences including regulatory 
and coding regions for the following genes from C. tropicalis ATCC 20336: CYP52A1A 
(SEQ ID NO: 85), CYP52A2A (SEQ ID NO: 86), CYP52A2B (SEQ ID NO: 87), 
CYP52A3A (SEQ ID NO: 88), CYP52A3B (SEQ ID NO: 89), CYP52A5A (SEQ ID 
NO: 90), CYP52A5B (SEQ ID NO: 91), CYP52A8A (SEQ ID NO: 92), CYP52A8B 
(SEQ ID NO: 93), and CYP52D4A (SEQ ID NO: 94). Figures 15A-15M show 
regulatory and coding region alignment of these sequences. Asterisks indicate conserved 
nucleotides. The start codons are underlined and the last amino acid coding codons 
immediately before the stop codon are underlined. 

Figures 16A-16C show the amino acid sequences of the CYP52A1A 
(SEQ ID NO: 95), CYP52A2A (SEQ ID NO: 96), CYP52A2B (SEQ ID NO: 97), 
CYP52A3A (SEQ ID NO: 98), CYP52A3B (SEQ ID NO: 99), CYP52A5A (SEQ ID 
NO: 100), CYP52A5B (SEQ ID NO: 101), CYP52A8A (SEQ ID NO: 102), CYP52A8B 
(SEQ ID NO: 103) and CYP52D4A (SEQ ID NO: 104) proteins from C. tropicalis 
ATCC 20336. Asterisks indicate identical residues and dots indicate conserved residues. 

Figure 17 is a diagrammatic representation of the pTAg PCR product 
cloning vector (commercially available from R&D Systems, Minneapolis, MN). 

Figure 18 is a plot of the log ratio (U/C) of unknown target DNA product to 
competitor DNA product versus the concentration of competitor mRNA. The plot is used 
to calculate the target messenger RNA concentration in a quantitative competitive reverse 
transcription polymerase chain reaction (QC-RT-PCR). 

Figure 19 is a graph showing the relative induction of C. tropicalis ATCC 
20962 CYP52A5A (SEQ ID NO: 90) by the addition of the fatty acid substrate Emersol® 
267 to the growth medium. 

Figure 20 is a graph showing the induction of C. tropicalis ATCC 20962 
CYP52 and CPR genes by Emersol® 267. P450 genes CYP52A3A (SEQ ID NO: 88), 
CYP52A3B (SEQ ID NO: 89), and CYP52D4A (SEQ ID NO: 94) are expressed at levels 
below the detection level of the QC-RT-PCR assay. 

Figure 21 is a scheme for integrating selected genes into the genome of 
Candida tropicalis strains and recovering the URA3A selectable marker. 

Figure 22 is a schematic representation of the transformation of C. tropicalis 
H5343 ura3 with CYP and/or CPR genes. Only one URA3 locus needs to be functional. 
There are a total of 6 possible ura3 targets (5 ura3A loci: 2 pox4 disruptions, 2 pox5 
disruptions, 1 ura3A locus; and 1 ura3B locus). 

Figure 23 is the complete DNA sequence (SEQ ID NO: 105) encoding 
URA3A from C. tropicalis ATCC 20336 and the amino acid sequence of the encoded 
protein (SEQ ID NO: 106). 

Figure 24 is a schematic representation of the plasmid pURAin, the base 
vector for integrating selected genes into the genome of C. tropicalis. The detailed 
construction of pURAin is described in the text. 

Figure 25 is a schematic representation of the plasmid pNEB193 cloning 
vector (commercially available from New England Biolabs, Beverly, MA). 

Figure 26 is a diagrammatic representation of the plasmid pPA15 containing 
the truncated CYP52A2A gene present in the pTriplEx vector. A detailed restriction map 
of only the sequenced region is shown at the top. The bar indicates the open reading 
frame. The direction of transcription is indicated by an arrow under the open reading 
frame. 

Figure 27 is a schematic representation of pURA2in. The base vector is 
constructed in pNEB193, which contains the 8 bp recognition sequences for Asc I, Pac I 
and Pme I. URA3A (SEQ ID NO: 105) and CYP52A2A (SEQ ID NO: 86) do not 
contain these 8 bp recognition sites. URA3A is inverted so that the transforming fragment 
will attempt to recircularize prior to integration. An Asc I/Pme I fragment was used to 
transform H5343 ura. 

Figure 28 shows a scheme to detect integration of the CYP52A2A gene (SEQ 
ID NO: 86) into the genome of H5343 ura. In all cases, hybridization band intensity could 
reflect the number of integrations. 

Figure 29 is a diagrammatic representation of the plasmid pPA57 containing 
the truncated CYP52A3A gene present in the pTriplEx vector. A detailed restriction map 
of only the sequenced region is shown at the top. The bar indicates the open reading 
frame. The direction of transcription is indicated by an arrow under the open reading 
frame. 

Figure 30 is a diagrammatic representation of the plasmid pPA62 containing 
the truncated CYP52A3B gene present in the pTriplEx vector. A detailed restriction map 
of only the sequenced region is shown at the top. The bar indicates the open reading 
frame. The direction of transcription is indicated by an arrow under the open reading 
frame. 

Figure 31 is a diagrammatic representation of the plasmid pPAL3 
containing the truncated CYP52A5A gene present in the pTriplEx vector. A detailed 
restriction map of only the sequenced region is shown at the top. The bar indicates the 
open reading frame. The direction of transcription is indicated by an arrow under the 
open reading frame. 

Figure 32 is a diagrammatic representation of the plasmid pPA5 containing 
the truncated CYP52A5A gene present in the pTriplEx vector. A detailed restriction map 
of only the sequenced region is shown at the top. The bar indicates the open reading 
frame. The direction of transcription is indicated by an arrow under the open reading 
frame. 

Figure 33 is a diagrammatic representation of the plasmid pPAlS containing 
the truncated CYP52D4A gene present in the pTriplEx vector. A detailed restriction map 
of only the sequenced region is shown at the top. The bar indicates the open reading 
frame. The direction of transcription is indicated by an arrow under the open reading 
frame. 

Figure 34 is a graph showing the expression of CYP52A1 (SEQ ID NO: 
85), CYP52A2A (SEQ ID NO: 86) and CYP52A5 genes (SEQ ID NOS: 90 and 91) from 
C. tropicalis 20962 in a fermentor run upon the addition of amounts of the substrate oleic 
acid or tridecane in a spiking experiment. 

Figure 35 depicts a scheme used for the extraction and analysis of diacids 
and monoacids from fermentation broths. 

Figure 36 is a graph showing the induction of expression of CYP52A1A, 
CYP52A2A and CYP52A5A in a fermentor run upon addition of the substrate 
octadecane. No induction of CYP52A3A or CYP52A3B was observed under these 
conditions. 

DESCRIPTION OF THE PREFERRED EMBODIMENTS 
Diacid productivity is improved according to the present invention by 
selectively increasing enzymes which are known to be important to the oxidation of organic 
substrates such as fatty acids composing the desired feed. According to the present 
invention, ten CYP genes and two CPR genes of C. tropicalis have been identified and 
characterized that relate to participation in the ω-hydroxylase complex catalyzing the first 
step in the ω-oxidation pathway. In addition, a novel quantitative competitive reverse 
transcription polymerase chain reaction (QC-RT-PCR) assay is used to measure gene 
expression in the fermentor under conditions of induction by one or more organic 
substrates as defined herein. Based upon QC-RT-PCR results, three CYP genes, 
CYP52A1, CYP52A2 and CYP52A5, have been identified as being of greater importance 
for the ω-oxidation of long chain fatty acids. Amplification of the CPR gene copy number 
improves productivity. The QC-RT-PCR assay indicates that both CYP and CPR genes 
appear to be under tight regulatory control. 

In accordance with the present invention, a method for discriminating 
members of a gene family by quantifying the amount of target mRNA in a sample is 
provided which includes a) providing an organism containing a target gene; b) culturing 
the organism with an organic substrate which causes upregulation in the activity of the 
target gene; c) obtaining a sample of total RNA from the organism at a first point in time; 
d) combining at least a portion of the sample of the total RNA with a known amount of 
competitor RNA to form an RNA mixture, wherein the competitor RNA is substantially 
similar to the target mRNA but has a lesser number of nucleotides compared to the target 
mRNA; e) adding reverse transcriptase to the RNA mixture in a quantity sufficient to 
form corresponding target DNA and competitor DNA; f) conducting a polymerase chain 
reaction in the presence of at least one primer specific for at least one substantially 
non-homologous region of the target DNA within the gene family, the primer also specific for 
the competitor DNA; g) repeating steps (c-f) using increasing amounts of the competitor 
RNA while maintaining a substantially constant amount of target RNA; h) determining 
the point at which the amount of target DNA is substantially equal to the amount of 
competitor DNA; i) quantifying the results by comparing the ratio of the concentration of 
unknown target to the known concentration of competitor; and j) obtaining a sample of 
total RNA from the organism at another point in time and repeating steps (d-i). 
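
The equivalence-point determination of steps (g) through (i) can be illustrated numerically. The following Python sketch is not part of the disclosed assay. It assumes, for illustration only, that log10(U/C) varies linearly with log10 of the competitor amount, so that the target amount is read off where the fitted line crosses log10(U/C) = 0. The function names and any input values are hypothetical.

```python
import math


def equivalence_point(competitor_amounts, ratios_u_over_c):
    """Estimate the target RNA amount from a competitor titration series.

    competitor_amounts: known competitor RNA added to each reaction.
    ratios_u_over_c:    measured ratio of target (U) to competitor (C)
                        product in each reaction.
    Assumption: log10(U/C) is linear in log10(competitor amount).
    """
    if len(competitor_amounts) != len(ratios_u_over_c):
        raise ValueError("each reaction needs one amount and one ratio")
    if len(competitor_amounts) < 2:
        raise ValueError("at least two titration points are required")

    xs = [math.log10(c) for c in competitor_amounts]
    ys = [math.log10(r) for r in ratios_u_over_c]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Ordinary least-squares fit of y = slope * x + intercept.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or sxy == 0:
        raise ValueError("titration series does not define a usable line")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # U/C == 1 means log10(U/C) == 0; solve for the competitor amount there.
    return 10 ** (-intercept / slope)


def relative_induction(estimates):
    """Express per-time-point estimates relative to the first time point."""
    baseline = estimates[0]
    return [value / baseline for value in estimates]
```

Applying `equivalence_point` to the titration series from each sampling time, and then `relative_induction` to the resulting estimates, gives an induction profile across time points in the manner of step (j).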

In addition, modification of existing promoters and/or the isolation of 
alternative promoters provides increased expression of CYP and CPR genes. Strong 
promoters are obtained from at least four sources: random or specific modifications of the 
CYP52A2 promoter, CYP52A5 promoter, CYP52A1 promoter, the selection of a strong 
promoter from available Candida β-oxidation genes such as POX4 and POX5, or 
screening to select another suitable Candida promoter. 

Promoter strength can be directly measured using QC-RT-PCR to measure 
CYP and CPR gene expression in Candida cells isolated from fermentors. Enzymatic 
assays and antibodies specific for CYP and CPR proteins are used to verify that increased 
promoter strength is reflected by increased synthesis of the corresponding enzymes. Once 
a suitable promoter is identified, it is fused to the selected CYP and CPR genes and 
introduced into Candida for construction of a new improved production strain. It is 
contemplated that the coding region of the CYP and CPR genes can be fused to suitable 
promoters or other regulatory sequences which are well known to those skilled in the art. 

In accordance with the present invention, studies on C. tropicalis ATCC 
20336 have identified six unique CYP genes and four potential alleles. QC-RT-PCR 
analyses of cells isolated during the course of the fermentation bioconversions indicate that 
at least three of the CYP genes are induced by fatty acids and at least two of the CYP genes 
are induced by alkanes. See Figure 34. Two of the genes are highly induced, 
indicating participation in the ω-hydroxylase complex which catalyzes the rate limiting step 
in the oxidation of fatty acids to the corresponding diacids. 

The biochemical characterization of each P450 enzyme herein is used to 
tailor the C. tropicalis host for optimal diacid productivity and to select P450 
enzymes to be amplified based upon the fatty acid content of the feedstream. CYP gene(s) 
encoding P450 enzymes that have a low specific activity for the fatty acid or alkane 
substrate of choice are targeted for inactivation, thereby reducing the physiological load on 
the cell. 

Since it has been demonstrated that CPR can be limiting in yeast systems, 
the removal of non-essential P450s from the system can free electrons that are being used 
by non-essential P450s and make them available to the P450s important for diacid 
productivity. Moreover, the removal of non-essential P450s can make available other 
necessary but potentially limiting components of the P450 system (i.e., available membrane 
space, heme and/or NADPH). 

Diacid productivity is thus improved by selective integration, amplification, 
and overexpression of CYP and CPR genes in the C. tropicalis production host. 

It should be understood that host cells into which one or more copies of 
desired CYP and/or CPR genes have been introduced can be made to include such genes 
by any technique known to those skilled in the art. For example, suitable host cells include 
prokaryotes such as Bacillus sp., Pseudomonas sp., Actinomycetes sp., Escherichia sp., 
Mycobacterium sp., and eukaryotes such as yeast, algae, insect cells, plant cells and 
filamentous fungi. Suitable host cells are preferably yeast cells such as Yarrowia, 
Debaryomyces, Saccharomyces, Schizosaccharomyces, and Pichia and more preferably 
those of the Candida genus. Preferred species of Candida are tropicalis, maltosa, apicola, 
paratropicalis, albicans, cloacae, guillermondii, intermedia, lipolytica, parapsilosis and 
zeylenoides. Certain preferred strains of Candida tropicalis are listed in U.S. Patent No. 
5,254,466, incorporated herein by reference. 

Vectors such as plasmids, phagemids, phages or cosmids can be used to 
transform or transfect suitable host cells. Host cells may also be transformed by 
introducing into a cell a linear DNA vector(s) containing the desired gene sequence. Such 
linear DNA may be advantageous when it is desirable to avoid introduction of non-native 
(foreign) DNA into the cell. For example, DNA consisting of a desired target gene(s) 
flanked by DNA sequences which are native to the cell can be introduced into the cell by 
electroporation, lithium acetate transformation, spheroplasting and the like. Flanking 
DNA sequences can include selectable markers and/or other tools for genetic engineering. 

It should be understood that, depending on whether a transformed 
organism utilizes the universal genetic code or the non-universal genetic code known, e.g., 
in connection with C. tropicalis, slight differences can be manifest in the amino acid 
sequences of protein products. Thus, nucleotide sequences containing a CTG codon 
produce proteins containing a CTG-encoded leucine in prokaryotes such as E. coli and a 
CTG-encoded serine in non-universal coding eukaryotes such as C. tropicalis. For 
example, the CYP52A1A gene contains one CTG codon starting at position 1354 which is 
translated as a leucine in E. coli and a serine in C. tropicalis, leading to two versions of the 
CYP52A1A protein (SEQ. ID. NO: 95 and SEQ. ID. NO: 110); the CYP52A3B gene 
contains one CTG codon starting at position 2449 which is translated as a leucine in E. coli 
and a serine in C. tropicalis, leading to two versions of the CYP52A3B protein (SEQ. ID. 
NO: 99 and SEQ. ID. NO: 111); the CYP52A5A gene contains two CTG codons starting, 
respectively, at positions 1883 and 2570, which are translated as leucine in E. coli and 
serine in C. tropicalis, leading to two versions of the CYP52A5A protein (SEQ. ID. NO: 
100 and SEQ. ID. NO: 112); the CYP52A5B gene contains two CTG codons starting, 
respectively, at positions 1922 and 2609, which are translated as leucine in E. coli and 
serine in C. tropicalis, leading to two versions of the CYP52A5B protein (SEQ. ID. NO: 
101 and SEQ. ID. NO: 113); the CYP52A8A gene contains one CTG codon starting at 
position 659, which is translated as a leucine in E. coli and a serine in C. tropicalis, leading 
to two versions of the CYP52A8B protein (SEQ. ID. NO: 103 and SEQ. ID. NO: 115); 
the CYP52D4A gene contains three CTG codons starting, respectively, at positions 1247, 
1412 and 1757, which are translated as leucine in E. coli and as serine in C. tropicalis, 
leading to two versions of the CYP52D4A protein (SEQ. ID. NO: 104 and SEQ. ID. NO: 
116); the CPRA (NCPIA) gene contains one CTG codon starting at position 1153 which is 
translated as a leucine in E. coli and as a serine in C. tropicalis, leading to two versions of 
the CPRA (NCPIA) protein (SEQ. ID. NO: 83 and SEQ. ID. NO: 117); the CPRB 
(NCPIB) gene contains one CTG codon starting at position 1180 which is translated as a 
leucine in E. coli and as a serine in C. tropicalis, leading to two versions of the CPRB 
(NCPIB) protein (SEQ. ID. NO: 84 and SEQ. ID. NO: 118). 
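
The effect of the CTG codon on the protein sequence can be illustrated with a short sketch. It is offered only as an illustration under stated assumptions: the non-universal table is modeled as the standard table with the single change CTG to serine, and the open reading frame `demo_orf` is hypothetical and is not taken from any SEQ ID of this specification.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
UNIVERSAL = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Assumption: the non-universal code differs only in reading CTG as serine.
CTG_SERINE = dict(UNIVERSAL, CTG="S")


def translate(coding_dna, table):
    """Translate an in-frame coding sequence up to the first stop codon."""
    protein = []
    for i in range(0, len(coding_dna) - 2, 3):
        residue = table[coding_dna[i:i + 3].upper()]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def in_frame_ctg_positions(coding_dna):
    """Return 1-based start positions of in-frame CTG codons."""
    return [i + 1 for i in range(0, len(coding_dna) - 2, 3)
            if coding_dna[i:i + 3].upper() == "CTG"]


# Hypothetical 15-nt open reading frame used only for demonstration.
demo_orf = "ATGGCTCTGTCTTAA"
print(translate(demo_orf, UNIVERSAL))    # reading as in E. coli
print(translate(demo_orf, CTG_SERINE))   # reading as in C. tropicalis
print(in_frame_ctg_positions(demo_orf))
```

Applied to a real coding sequence, the two translations would differ only at the residues encoded by in-frame CTG codons, which is why each gene listed above gives rise to exactly two protein versions.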

A suitable organic substrate herein can be any organic compound that is 
biooxidizable to a mono- or polycarboxylic acid. Such a compound can be any saturated 
or unsaturated aliphatic compound or any carbocyclic or heterocyclic aromatic compound 
having at least one terminal methyl group, a terminal carboxyl group and/or a terminal 
functional group which is oxidizable to a carboxyl group by biooxidation. A terminal 
functional group which is a derivative of a carboxyl group may be present in the substrate 
molecule and may be converted to a carboxyl group by a reaction other than biooxidation. 
For example, if the terminal group is an ester that neither wild-type C. tropicalis nor 
the genetically modified strains described herein can hydrolyze to 
a carboxyl group, then a lipase can be added during the fermentation step to liberate free 
fatty acids. Suitable organic substrates include, but are not limited to, saturated fatty acids, 
unsaturated fatty acids, alkanes, alkenes, alkynes and combinations thereof. 

Alkanes are a type of saturated organic substrate which are useful herein. 
The alkanes can be linear or cyclic, branched or straight chain, substituted or 
unsubstituted. Particularly preferred alkanes are those having from about 4 to about 25 
carbon atoms, examples of which include but are not limited to butane, hexane, octane, 
nonane, dodecane, tridecane, tetradecane, octadecane and the like. 

Examples of unsaturated organic substrates which can be used herein 
include but are not limited to internal olefins such as 2-pentene, 2-hexene, 3-hexene, 
9-octadecene and the like; unsaturated carboxylic acids such as 2-hexenoic acid and esters 
thereof, oleic acid and esters thereof including triglyceryl esters having a relatively high oleic 
acid content, erucic acid and esters thereof including triglyceryl esters having a relatively 
high erucic acid content, ricinoleic acid and esters thereof including triglyceryl esters having 
a relatively high ricinoleic acid content, linoleic acid and esters thereof including triglyceryl 
esters having a relatively high linoleic acid content; unsaturated alcohols such as 
3-hexen-1-ol, 9-octadecen-1-ol and the like; unsaturated aldehydes such as 3-hexen-1-al, 
9-octadecen-1-al and the like. In addition to the above, organic substrates which can be used 
herein include alicyclic compounds having at least one internal carbon-carbon double bond and at 
least one terminal methyl group, a terminal carboxyl group and/or a terminal functional 
group which is oxidizable to a carboxyl group by biooxidation. Examples of such 
compounds include but are not limited to 3,6-dimethyl-1,4-cyclohexadiene; 
3-methylcyclohexene; 3-methyl-1,4-cyclohexadiene and the like. 

Examples of the aromatic compounds that can be used herein include but 
are not limited to arenes such as o-, m-, p-xylene; o-, m-, p-methyl benzoic acid; dimethyl 
pyridine, and the like. The organic substrate can also contain other functional groups that 
are biooxidizable to carboxyl groups such as an aldehyde or alcohol group. The organic 
substrate can also contain other functional groups that are not biooxidizable to carboxyl 
groups and do not interfere with the biooxidation such as halogens, ethers, and the like. 

Examples of saturated fatty acids which may be applied to cells 
incorporating the present CYP and CPR genes include caproic, enanthic, caprylic, 
pelargonic, capric, undecylic, lauric, myristic, pentadecanoic, palmitic, margaric, stearic, 
arachidic, behenic acids and combinations thereof. Examples of unsaturated fatty acids 
which may be applied to cells incorporating the present CYP and CPR genes include 
palmitoleic, oleic, erucic, linoleic, linolenic acids and combinations thereof. Alkanes and 
fractions of alkanes may be applied which include chain lengths from C12 to C24 in any 
combination. Examples of preferred fatty acid mixtures are Emersol® 267 and 
Tallow, both commercially available from Henkel Chemicals Group, Cincinnati, OH. The 
typical fatty acid composition of Emersol® 267 and Tallow is as follows: 

             TALLOW      E267

C14:0        3.5%        2.4%
C14:1        1.0%        0.7%
C15:0        0.5%
C16:0        25.5%       4.6%
C16:1        4.0%        5.7%
C17:0        2.5%
C17:1                    5.7%
C18:0        19.5%       1.0%
C18:1        41.0%       69.9%
C18:2        2.5%        8.8%
C18:3                    0.3%
C20:0        0.5%
C20:1                    0.9%
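
As a rough check on the composition data, the proportion of unsaturated species in each mixture can be tallied. The sketch below assumes that an empty table cell means the species was not reported and treats it as absent; the dictionary and function names are illustrative only.

```python
# Percent composition transcribed from the table above; blank cells are omitted.
TALLOW = {"C14:0": 3.5, "C14:1": 1.0, "C15:0": 0.5, "C16:0": 25.5, "C16:1": 4.0,
          "C17:0": 2.5, "C18:0": 19.5, "C18:1": 41.0, "C18:2": 2.5, "C20:0": 0.5}
E267 = {"C14:0": 2.4, "C14:1": 0.7, "C16:0": 4.6, "C16:1": 5.7, "C17:1": 5.7,
        "C18:0": 1.0, "C18:1": 69.9, "C18:2": 8.8, "C18:3": 0.3, "C20:1": 0.9}


def unsaturated_share(composition):
    """Sum the percentages of species carrying at least one double bond."""
    total = sum(pct for name, pct in composition.items() if not name.endswith(":0"))
    return round(total, 1)


print("Tallow unsaturated %:", unsaturated_share(TALLOW))
print("E267 unsaturated %:", unsaturated_share(E267))
```

On these figures Emersol® 267 is predominantly unsaturated (mostly C18:1), whereas Tallow is roughly half saturated.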



The following examples are meant to illustrate but not to limit the 
invention. All relevant microbial strains and plasmids are described in Table 1 and Table 
2, respectively. 
Table 1. List of Escherichia coli and Candida tropicalis strains 

E. coli STRAIN   GENOTYPE                                        SOURCE

XL1Blue-MRF'     endA1, gyrA96, hsdR17, lac-, recA1, relA1,      Stratagene, La Jolla, CA
                 supE44, thi-1, [F' lacIqZΔM15, proAB, Tn10]

BM25.8           supE44, thi Δ(lac-proAB) [F' traD36, proAB+,    Clontech, Palo Alto, CA
                 lacIqZΔM15] λimm434 (kanR) P1 (camR)
                 hsdR (rK12- mK12+)

XLOLR            Δ(mcrA)183 Δ(mcrCB-hsdSMR-mrr)173 endA1         Stratagene, La Jolla, CA
                 thi-1 recA1 gyrA96 relA1 lac
                 [F' proAB lacIqZΔM15 Tn10 (Tetr)] Su-
                 (nonsuppressing) λr (lambda resistant)

C. tropicalis    GENOTYPE                                        SOURCE
STRAIN

ATCC20336        Wild-type                                       American Type Culture
                                                                 Collection, Rockville, MD

ATCC750          Wild-type                                       American Type Culture
                                                                 Collection, Rockville, MD

ATCC 20962       ura3A/ura3B,                                    Henkel
                 pox4A::ura3A/pox4B::ura3A,
                 pox5::ura3A/pox5::URA3A

H5343 ura-       ura3A/ura3B,                                    Henkel
                 pox4A::ura3A/pox4B::ura3A,
                 pox5::ura3A/pox5::URA3A, ura3-

HDC1             ura3A/ura3B,                                    Henkel
                 pox4A::ura3A/pox4B::ura3A,
                 pox5::ura3A/pox5::URA3A,
                 ura3::URA3A-CYP52A2A

HDC5             ura3A/ura3B,                                    Henkel
                 pox4A::ura3A/pox4B::ura3A,
                 pox5::ura3A/pox5::URA3A,
                 ura3::URA3A-CYP52A3A

HDC10            ura3A/ura3B,                                    Henkel
                 pox4A::ura3A/pox4B::ura3A,
                 pox5::ura3A/pox5::URA3A,
                 ura3::URA3A-CPRB

HDC15            ura3A/ura3B,                                    Henkel
                 pox4A::ura3A/pox4B::ura3A,
                 pox5::ura3A/pox5::URA3A,
                 ura3::URA3A-CYP52A5A

HDC20            ura3A/ura3B,                                    Henkel
                 pox4A::ura3A/pox4B::ura3A,
                 pox5::ura3A/pox5::URA3A,
                 ura3::URA3A-CYP52A2A + CPRB
                 (CYP and CPR have opposite 5' to 3'
                 orientation with respect to each other)

HDC23            ura3A/ura3B,                                    Henkel
                 pox4A::ura3A/pox4B::ura3A,
                 pox5::ura3A/pox5::URA3A,
                 ura3::URA3A-CYP52A2A + CPRB
                 (CYP and CPR have same 5' to 3'
                 orientation with respect to each other)





Table 2. List of plasmids isolated from genomic libraries and constructed for use 
in gene integrations. 

Plasmid      Base vector  Insert               Insert size     Plasmid size     Description

pURAin       pNEB193      URA3A                1706 bp         4399 bp          pNEB193 with the URA3A gene
                                                                                inserted in the AscI - PmeI site,
                                                                                generating a PacI site

pURA2in      pURAin       CYP52A2A             2230 bp         6629 bp          pURAin containing a PCR
                                                                                CYP52A2A allele containing
                                                                                PacI restriction sites

pURAREDBin   pURAin       CPRB                 3266 bp         7665 bp          pURAin containing a PCR
                                                                                CPRB allele containing PacI
                                                                                restriction sites

pHKM1        pTriplEx     Truncated CPRA gene  Approx. 3.8 kb  Approx. 7.4 kb   A truncated CPRA gene
                                                                                obtained by first screening library
                                                                                containing the 5' untranslated
                                                                                region and 1.2 kb open reading
                                                                                frame

pHKM4        pTriplEx     Truncated CPRA gene  Approx. 5 kb    Approx. 8.6 kb   A truncated CPRA gene
                                                                                obtained by screening second
                                                                                library containing the 3'
                                                                                untranslated region end sequence

pHKM9        pBC-CMV      CPRB gene            Approx. 5.3 kb  Approx. 9.8 kb   CPRB allele isolated from the
                                                                                third library

pHKM11       pBC-CMV      CYP52A1A             Approx. 5 kb    Approx. 9.5 kb   CYP52A1A isolated from the
                                                                                third library

pHKM12       pBC-CMV      CYP52A8A             Approx. 7.5 kb  Approx. 12 kb    CYP52A8A isolated from the
                                                                                third library

pHKM13       pBC-CMV      CYP52D4A             Approx. 7.3 kb  Approx. 11.8 kb  CYP52D4A isolated from the
                                                                                third library

pHKM14       pBC-CMV      CYP52A2B             Approx. 6 kb    Approx. 10.5 kb  CYP52A2B isolated from the
                                                                                third library

pHKM15       pBC-CMV      CYP52A8B             Approx. 6.6 kb  Approx. 11.1 kb  CYP52A8B isolated from the
                                                                                third library

pPAL3        pTriplEx     CYP52A5A             4.4 kb          Approx. 8.1 kb   CYP52A5A isolated from the 1st
                                                                                library

pPA5         pTriplEx     CYP52A5B             4.1 kb          Approx. 7.8 kb   CYP52A5B isolated from the
                                                                                2nd library

pPA15        pTriplEx     CYP52A2A             6.0 kb          Approx. 9.7 kb   CYP52A2A isolated from the
                                                                                2nd library

pPA57        pTriplEx     CYP52A3A             5.5 kb          Approx. 9.2 kb   CYP52A3A isolated from the
                                                                                2nd library

pPA62        pTriplEx     CYP52A3B             6.0 kb          Approx. 9.7 kb   CYP52A3B isolated from the
                                                                                2nd library



EXAMPLE 1 

Purification of Genomic DNA from Candida tropicalis ATCC 20336 
A. Construction of Genomic Libraries 

50 ml of YEPD broth (see Table 9) was inoculated with a single colony of 
C. tropicalis 20336 from a YEPD agar plate and grown overnight at 30°C. 5 ml of the 
overnight culture was inoculated into 100 ml of fresh YEPD broth and incubated at 30°C 
for 4 to 5 hr with shaking. Cells were harvested by centrifugation, washed twice with 
sterile distilled water and resuspended in 4 ml of spheroplasting buffer (1 M sorbitol, 50 
mM EDTA, 14 mM mercaptoethanol) and incubated for 30 min at 37°C with gentle 
shaking. 0.5 ml of 2 mg/ml zymolyase (ICN Pharmaceuticals, Inc., Irvine, CA) was 
added and incubated at 37°C with gentle shaking for 30 to 60 min. Spheroplast 
formation was monitored by SDS lysis. Spheroplasts were harvested by brief 
centrifugation (4,000 rpm, 3 min) and were washed once with the spheroplasting buffer 
without mercaptoethanol. Harvested spheroplasts were then suspended in 4 ml of lysis 
buffer (0.2 M Tris/pH 8.0, 50 mM EDTA, 1% SDS) containing 100 µg/ml RNase 
(Qiagen Inc., Chatsworth, CA) and incubated at 37°C for 30 to 60 min. 

Proteins were denatured and extracted twice with an equal volume of 
chloroform/isoamyl alcohol (24:1) by gently mixing the two phases by hand inversions. 
The two phases were separated by centrifugation at 10,000 rpm for 10 min and the 
aqueous phase containing the high-molecular weight DNA was recovered. To the 
aqueous layer NaCl was added to a final concentration of 0.2 M and the DNA was 
precipitated by adding 2 vol of ethanol. Precipitated DNA was spooled with a clean glass 
rod and resuspended in TE buffer (10 mM Tris/pH 8.0, 1 mM EDTA) and allowed to 
dissolve overnight at 4°C. To the dissolved DNA, RNase free of any DNase activity 
(Qiagen Inc., Chatsworth, CA) was added to a final concentration of 50 µg/ml and 
incubated at 37°C for 30 min. Then protease (Qiagen Inc., Chatsworth, CA) was added 
to a final concentration of 100 µg/ml and incubated at 55 to 60°C for 30 min. The 
solution was extracted once with an equal volume of phenol/chloroform/isoamyl alcohol 
(25:24:1) and once with an equal volume of chloroform/isoamyl alcohol (24:1). To the 
aqueous phase 0.1 vol of 3 M sodium acetate and 2 volumes of ice cold ethanol (200 
proof) were added and the high molecular weight DNA was spooled with a glass rod and 
dissolved in 1 to 2 ml of TE buffer. 

B. Genomic DNA Preparation for PCR 
Amplification of CYP and CPR Genes 

5 ml of YPD medium was inoculated with a single colony and grown at 
30°C overnight. The culture was centrifuged for 5 min at 1200 x g. The supernatant was 
removed by aspiration and 0.5 ml of a sorbitol solution (0.9 M sorbitol, 0.1 M Tris-Cl pH 
8.0, 0.1 M EDTA) was added to the pellet. The pellet was resuspended by vortexing and 1 
µl of 2-mercaptoethanol and 50 µl of a 10 µg/ml zymolyase solution were added to the 
mixture. The tube was incubated at 37°C for 1 hr on a rotary shaker (200 rpm). The 
tube was then centrifuged for 5 min at 1200 x g and the supernatant was removed by 
aspiration. The protoplast pellet was resuspended in 0.5 ml 1x TE (10 mM Tris-Cl pH 
8.0, 1 mM EDTA) and transferred to a 1.5 ml microcentrifuge tube. The protoplasts were 
lysed by the addition of 50 µl 10% SDS followed by incubation at 65°C for 20 min. Next, 
200 µl of 5 M potassium acetate was added and after mixing, the tube was incubated on ice 
for at least 30 min. Cellular debris was removed by centrifugation at 13,000 x g for 5 min. 
The supernatant was carefully removed and transferred to a new microfuge tube. The 
DNA was precipitated by the addition of 1 ml 100% (200 proof) ethanol followed by 
centrifugation for 5 min at 13,000 x g. The DNA pellet was washed with 1 ml 70% 
ethanol followed by centrifugation for 5 min at 13,000 x g. After partially drying the DNA 
under a vacuum, it was resuspended in 200 µl of 1x TE. The DNA concentration was 
determined by the ratio of the absorbance at 260 nm / 280 nm (A260/280). 
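
The spectrophotometric readout can be sketched as follows. The conversion factor is an assumption not stated in this example: a common laboratory convention takes one A260 unit at a 1 cm path length as roughly 50 µg/ml of double-stranded DNA, with the A260/A280 ratio serving as a purity indicator. The absorbance readings and dilution used below are hypothetical.

```python
def dsdna_conc_ug_per_ml(a260, dilution_factor=1.0, path_cm=1.0):
    """Estimate dsDNA concentration (assumed 50 ug/ml per A260 unit at 1 cm)."""
    return a260 * 50.0 * dilution_factor / path_cm


def purity_ratio(a260, a280):
    """A260/A280 ratio; values near 1.8 are commonly taken to indicate clean DNA."""
    return a260 / a280


# Hypothetical readings for a sample diluted 20-fold before measurement.
print(dsdna_conc_ug_per_ml(0.25, dilution_factor=20))
print(purity_ratio(0.25, 0.125))
```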

EXAMPLE 2 

Construction of Candida tropicalis 20336 Genomic Libraries 

Three genomic libraries of C. tropicalis were constructed, two at Clontech 
Laboratories, Inc., (Palo Alto, CA) and one at Henkel Corporation (Cincinnati, OH). 

A. Clontech Libraries 

The first Clontech library was made as follows: Genomic DNA was 
prepared from C. tropicalis 20336 as described above, partially digested with EcoRI and 
size fractionated by gel electrophoresis to eliminate fragments smaller than 0.6 kb. 
Following size fractionation, several ligations of the EcoRI genomic DNA fragments and 
lambda (λ) TriplEx™ vector (Figure 1) arms with EcoRI sticky ends were packaged 
into λ phage heads under conditions designed to obtain one million independent clones. 
The second genomic library was constructed as follows: Genomic DNA was digested 
partially with Sau3AI and size fractionated by gel electrophoresis. The DNA fragments 
were blunt ended using standard protocols as described, e.g., in Sambrook et al., 
Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Press, USA (1989), 
incorporated herein by reference. The strategy was to fill in the Sau3AI overhangs with 
Klenow polymerase (Life Technologies, Grand Island, NY) followed by digestion with 
S1 nuclease (Life Technologies, Grand Island, NY). After S1 nuclease digestion the 
fragments were end filled one more time with Klenow polymerase to obtain the final 
blunt-ended DNA fragments. EcoRI linkers were ligated to these blunt-ended DNA 
fragments followed by ligation into the λTriplEx vector. The resultant library contained 
approximately 2 X 10^ independent clones with an average insert size of 4.5 kb. 
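
Whether a library of a given size is likely to contain a particular gene can be estimated with the widely used Clarke and Carbon relation, N = ln(1 - P) / ln(1 - f), where f is the ratio of insert size to genome size. The sketch below is illustrative only: the 4.5 kb insert size comes from the text, while the genome size of 15 Mb is a hypothetical placeholder rather than a value from this specification.

```python
import math


def clones_needed(probability, insert_bp, genome_bp):
    """Number of independent clones for a sequence to be present with the given probability."""
    fraction = insert_bp / genome_bp
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))


def coverage_probability(n_clones, insert_bp, genome_bp):
    """Probability that a given sequence is present among n independent clones."""
    fraction = insert_bp / genome_bp
    return 1.0 - (1.0 - fraction) ** n_clones


ASSUMED_GENOME_BP = 15_000_000   # hypothetical genome size, for illustration only
INSERT_BP = 4_500                # average insert size reported for the second library

n99 = clones_needed(0.99, INSERT_BP, ASSUMED_GENOME_BP)
print(n99, coverage_probability(n99, INSERT_BP, ASSUMED_GENOME_BP))
```

The relation assumes random fragmentation and unbiased cloning, so the result is best read as a lower bound on the number of clones to screen.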
B. Henkel Library 

The third genomic library was constructed at Henkel Corporation using 
λZAP Express™ vector (Stratagene, La Jolla, CA) (Figure 2). Genomic DNA was 
partially digested with Sau3AI and fragments in the range of 6 to 12 kb were purified 
from an agarose gel after electrophoresis of the digested DNA. These DNA fragments 
were then ligated to BamHI digested λZAP Express™ vector arms according to 
manufacturer's protocols. Three ligations were set up to obtain approximately 9.8 X 10^ 
independent clones. All three libraries were pooled and amplified according to 
manufacturer instructions to obtain high-titre (>10^ plaque forming units/ml) stock for 
long-term storage. The titre of packaged phage library was ascertained after infection of 
E. coli XL1Blue-MRF' cells. E. coli XL1Blue-MRF' were grown overnight in either 
LB medium or NZCYM (Table 9) containing 10 mM MgSO4 and 0.2% maltose at 37°C 
or 30°C, respectively, with shaking. Cells were then centrifuged and resuspended in 0.5 
to 1 volume of 10 mM MgSO4. 200 µl of this E. coli culture was mixed with several 
dilutions of packaged phage library and incubated at 37°C for 15 min. To this mixture 
2.5 ml of LB top agarose or NZCYM top agarose (maintained at 60°C) (see Table 9) 
was added and plated on LB agar or NZCYM agar (see Table 9) present in 82 mm petri 
dishes. Phage were allowed to propagate overnight at 37°C to obtain discrete plaques 
and the phage titre was determined. 
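
The titre follows from the plaque count, the dilution plated and the volume of diluted phage used. A minimal sketch of that arithmetic is given below; the plaque count, dilution and volume are hypothetical values chosen only to show the calculation.

```python
def titre_pfu_per_ml(plaque_count, dilution_fold, volume_plated_ml):
    """Plaque forming units per ml of the undiluted phage stock."""
    return plaque_count * dilution_fold / volume_plated_ml


# Hypothetical plate: 150 plaques from 0.1 ml of a 10^6-fold dilution.
print(titre_pfu_per_ml(150, 10**6, 0.1))
```

Plates carrying well-separated plaques are the ones normally counted, since confluent or very sparse plates give unreliable estimates.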

EXAMPLE 3 
Screening of Genomic Libraries 

Both λTriplEx™ and λZAP Express™ vectors are phagemid vectors that 
can be propagated either as phage or plasmid DNA (after conversion of phage to 
plasmid). Therefore, the genomic libraries constructed in these vectors can be screened 
either by plaque hybridization (screening of lambda form of library) or by colony 
hybridization (screening plasmid form of library after phage to plasmid conversion). 
Both vectors are capable of expressing the cloned genes and the main difference is the 
mechanism of excision of plasmid from the phage DNA. The cloning site in λTriplEx™ 
is located within a plasmid which is present in the phage and is flanked by loxP sites 
(Figure 1). When λTriplEx™ is introduced into E. coli strain BM25.8 (supplied by 
Clontech), the Cre recombinase present in BM25.8 promotes the excision and 
circularization of plasmid pTriplEx from the phage λTriplEx™ at the loxP sites. The 
mechanism of excision of plasmid pBK-CMV from phage λZAP Express™ is different. 
It requires the assistance of a helper phage such as ExAssist™ (Stratagene) and an E. coli 
strain such as XLOLR (Stratagene). Both pTriplEx and pBK-CMV can replicate 
autonomously in E. coli. 

A. Screening Genomic Libraries (Plasmid Form) 
1) Colony Lifts 

A single colony of E. coli BM25.8 was inoculated into 5 ml of LB 
containing 50 µg/ml kanamycin, 10 mM MgSO4 and 0.1% maltose and grown overnight 
at 31°C, 250 rpm. To 200 µl of this overnight culture (approximately 4 X 10^ cells) 1 µl of phage 
library (2-5 X 10^ plaque forming units) and 150 µl LB broth were added and incubated 
at 31°C for 30 min, after which 400 µl of LB broth was added and incubated at 31°C, 
225 rpm for 1 h. This bacterial culture was diluted and plated on LB agar containing 50 
µg/ml ampicillin (Sigma Chemical Company, St. Louis, MO) and kanamycin (Sigma 
Chemical Company) to obtain 500 to 600 colonies/plate. The plates were incubated at 
37°C for 6 to 7 hrs until the colonies became visible. The plates were then stored at 4°C 
for 1.5 h before placing a Colony/Plaque Screen™ Hybridization Transfer Membrane disc 
(DuPont NEN Research Products, Boston, MA) on the plate in contact with bacterial 
colonies. The transfer of colonies to the membrane was allowed to proceed for 3 to 5 min. 
The membrane was then lifted and placed on a fresh LB agar (see Table 9) plate 
containing 200 µg/ml of chloramphenicol with the side exposed to the bacterial colonies 
facing up. The plates containing the membranes were then incubated at 37°C overnight 
in order to allow full development of the bacterial colonies. The LB agar plates from 
which colonies were initially lifted were incubated at 37°C overnight and stored at 4°C 
for future use. The following morning the membranes containing bacterial colonies were 
lifted and placed on two sheets of Whatman 3M (Whatman, Hillsboro, OR) paper 
saturated with 0.5 N NaOH and left at room temperature (RT) for 3 to 6 min to lyse the 
cells. Additional treatment of membranes was as described in the protocol provided by 
NEN Research Products. 

2) DNA Hybridizations 

Membranes were dried overnight before hybridizing to oligonucleotide 
probes prepared using a non-radioactive ECL™ 3'-oligolabelling and detection system 
from Amersham Life Sciences (Arlington Heights, IL). DNA labeling, prehybridization 
and hybridizations were performed according to manufacturer's protocols. After 
hybridization, membranes were washed twice at room temperature in 5 X SSC, 0.1% 
SDS (in a volume equivalent to 2 ml/cm² of membrane) for 5 min each followed by two 
washes at 50°C in 1 X SSC, 0.1% SDS (in a volume equivalent to 2 ml/cm² of membrane) 
for 15 min each. The hybridization signal was then generated and detected with 
Hyperfilm ECL™ (Amersham) according to manufacturer's protocols. Membranes were 
aligned to plates containing bacterial colonies from which colony lifts were performed 
and colonies corresponding to positive signals on X-ray film were then isolated and 
propagated in LB broth. Plasmid DNAs were isolated from these cultures and analyzed 
by restriction enzyme digestions and by DNA sequencing. 

B. Screening Genomic Libraries (Plaque Form) 
1) λ Library Plating 

E. coli XL1Blue-MRF' cells were grown overnight in LB medium (25 ml) 
containing 10 mM MgSO4 and 0.2% maltose at 37°C, 250 rpm. Cells were then 
centrifuged (2,200 x g for 10 min) and resuspended in 0.5 volumes of 10 mM MgSO4. 
500 µl of this E. coli culture was mixed with a phage suspension containing 25,000 
amplified lambda phage particles and incubated at 37°C for 15 min. To this mixture 6.5 
ml of NZCYM top agarose (maintained at 60°C) (see Chart) was added and plated on 80 
- 100 ml NZCYM agar (see Chart) present in a 150 mm petri dish. Phage were allowed 
to propagate overnight at 37°C to obtain discrete plaques. After overnight growth plates 
were stored in a refrigerator for 1-2 hr before plaque lifts were performed. 

2) Plaque Lift and DNA Hybridizations 

Magna Lift™ nylon membranes (Micron Separations, Inc., Westborough, 
MA) were placed on the agar surface in complete contact with λ plaques and transfer of 
plaques to nylon membranes was allowed to proceed for 5 min at RT. After plaque 
transfer the membrane was placed on 2 sheets of Whatman 3M™ (Whatman, Hillsboro, 
OR) filter paper saturated with a 0.5 N NaOH, 1.0 M NaCl solution and left for 10 min at 
RT to denature DNA. Excess denaturing solution was removed by blotting briefly on 
dry Whatman 3M paper. Membranes were then transferred to 2 sheets of Whatman 3M™ 
paper saturated with 0.5 M Tris-HCl (pH 8.0), 1.5 M NaCl and left for 5 min to 
neutralize. Membranes were then briefly washed in 200 - 500 ml of 2 X SSC, dried by 
air and baked for 30 - 40 min at 80°C. The membranes were then probed with labelled 
DNA. 

Membranes were prewashed with a 200 - 500 ml solution of 5 X SSC, 
0.5% SDS, 1 mM EDTA (pH 8.0) for 1 - 2 hr at 42°C with shaking (60 rpm) to get rid of 
bacterial debris from the membranes. The membranes were prehybridized for 1 - 2 hr at 
42°C with ECL Gold™ buffer (Amersham) containing 0.5 M NaCl and 5% blocking 
reagent (in a volume equivalent to 0.125 - 0.25 ml/cm² of membrane). DNA fragments 
that were used as probes were purified from agarose gel using a QIAEX II™ gel 
extraction kit (Qiagen Inc., Chatsworth, CA) according to manufacturer's protocol and 
labeled using an Amersham ECL™ direct nucleic acid labeling kit (Amersham). Labeled 
DNA (5-10 ng/ml hybridization solution) was added to the prehybridized membranes 
and the hybridization was allowed to proceed overnight. The following day membranes 
were washed with shaking (60 rpm) twice at 42°C for 20 min each time in a buffer 
containing either 0.1 (high stringency) or 0.5 (low stringency) X SSC, 0.4% SDS and 
360 g/l urea (in a volume equivalent to 2 ml/cm² of membrane). This was followed by two 5 
min washes at room temperature in 2 X SSC (in a volume equivalent to 2 ml/cm² of 
membrane). Hybridization signals were generated using the ECL™ nucleic acid detection 
reagent and detected using Hyperfilm ECL™ (Amersham). 
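
The stringency wash buffer can be made up from concentrated stocks by simple dilution arithmetic. The sketch below is offered as an aid only: the 20 X SSC and 10% SDS stock concentrations and the membrane diameter are assumptions not stated in this example, while the target concentrations and the 2 ml/cm² wash volume are taken from the text.

```python
import math


def membrane_area_cm2(diameter_mm):
    """Area of a circular membrane disc in square centimetres."""
    radius_cm = diameter_mm / 20.0
    return math.pi * radius_cm ** 2


def wash_buffer_recipe(total_ml, ssc_x, ssc_stock_x=20.0,
                       sds_pct=0.4, sds_stock_pct=10.0, urea_g_per_l=360.0):
    """Volumes of assumed stock solutions and mass of urea for one batch of wash buffer."""
    return {
        "ssc_stock_ml": total_ml * ssc_x / ssc_stock_x,
        "sds_stock_ml": total_ml * sds_pct / sds_stock_pct,
        "urea_g": urea_g_per_l * total_ml / 1000.0,
    }


# Hypothetical 137 mm membrane washed at 2 ml per cm2, low stringency (0.5 X SSC).
volume_ml = 2.0 * membrane_area_cm2(137)
print(round(volume_ml), wash_buffer_recipe(1000, 0.5))
```

Switching `ssc_x` from 0.5 to 0.1 gives the high stringency variant with the same SDS and urea content.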

Agar plugs which contained plaques corresponding to positive signals on 
the X-ray film were taken from the master plates using the broad end of a Pasteur pipet. 
Plaques were selected by aligning the plates with the X-ray film. At this stage, multiple 
plaques were generally taken. Phage particles were eluted from the agar plugs by soaking 
in 1 ml SM buffer (Sambrook et al., supra) overnight. The phage eluate was then diluted 
and plated with freshly grown E. coli XL1Blue-MRF' cells to obtain 100 - 500 plaques 
per 85 mm NZCYM agar plate. Plaques were transferred to Magna Lift nylon 
membranes as before and probed again using the same probe. Single well-isolated 
plaques corresponding to signals on X-ray film were picked by removing agar plugs 
and eluting the phage by soaking overnight in 0.5 ml SM buffer. 

C. Conversion of λ Clones to Plasmid Form 

The lambda clones isolated were converted to plasmid form for further 
analysis. Conversion from the plaque to the plasmid form was accomplished by infecting 
E. coli strain BM25.8 with the eluted phage. The E. coli strain was grown overnight at 31°C, 
250 rpm in LB broth containing 10 mM MgSO4 and 0.2% maltose until the OD600 reached 
1.1 - 1.4. Ten milliliters of the overnight culture was removed and mixed with 100 µl of 
1 M MgCl2. A 200 µl volume of cells was removed, mixed with 150 µl of eluted phage 
suspension and incubated at 31°C for 30 min. LB broth (400 µl) was added to the tube 
and incubation was continued at 31°C for 1 hr with shaking, 250 rpm. 1 - 10 µl of the 
infected cell suspension was plated on LB agar containing 100 µg/ml ampicillin (Sigma, 
St. Louis, MO). Well-isolated colonies were picked and grown overnight in 5 ml LB 
broth containing 100 µg/ml ampicillin at 37°C, 250 rpm. Plasmid DNA was isolated 
from these cultures and analyzed. To convert the λZAP Express™ vector to plasmid 
form E. coli strains XL1Blue-MRF' and XLOLR were used. The conversion was 
performed according to the manufacturer's (Stratagene) protocols for single-plaque 
excision. 

EXAMPLE 4 

Transformation of C. tropicalis H5343 ura- 
A. Transformation of C. tropicalis H5343 by Electroporation 

5 ml of YEPD was inoculated with C. tropicalis H5343 ura- from a frozen 
stock and incubated overnight on a New Brunswick shaker at 30°C and 170 rpm. The next 
day, 10 µl of the overnight culture was inoculated into 100 ml YEPD and growth was 
continued at 30°C, 170 rpm. The following day the cells were harvested at an OD600 of 
1.0 and the cell pellet was washed one time with sterile ice-cold water. The cells were 
resuspended in ice-cold sterile 35% polyethylene glycol (4,000 MW) to a density of 5 x 10^ 
cells/ml. A 0.1 ml volume of cells was utilized for each electroporation. The following 
electroporation protocol was followed: 1.0 µg of transforming DNA was added to 0.1 ml 
cells, along with 5 µg denatured, sheared calf thymus DNA and the mixture was allowed to 
incubate on ice for 15 min. The cell solution was then transferred to an ice-cold 0.2 cm 
electroporation cuvette, tapped to make sure the solution was on the bottom of the cuvette 
and electroporated. The cells were electroporated using an Invitrogen electroporator 
(Carlsbad, CA) at 450 Volts, 200 Ohms and 250 µF. Following electroporation, 0.9 ml 
SOS media (1 M sorbitol, 30% YEPD, 10 mM CaCl2) was added to the suspension. The 
resulting culture was grown for 1 hr at 30°C, 170 rpm. Following the incubation, the cells 
were pelleted by centrifugation at 1500 x g for 5 min. The electroporated cells were 
resuspended in 0.2 ml of 1 M sorbitol and plated on synthetic complete media minus uracil 
(SC - uracil) (Nelson, supra). In some cases the electroporated cells were plated directly 
onto SC - uracil. Growth of transformants was monitored for 5 days. After three days, 
several transformants were picked and transferred to SC-uracil plates for genomic DNA 
preparation and screening. 
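
The electroporation settings can be translated into two derived quantities that are often used to compare protocols: the field strength across the cuvette gap and the nominal RC time constant of the pulse. The sketch below is only an approximation, since it treats the 200 Ohm setting as the sole resistance and ignores the resistance of the sample itself; the function names are illustrative.

```python
def field_strength_kv_per_cm(volts, gap_cm):
    """Electric field across the cuvette gap in kV/cm."""
    return volts / gap_cm / 1000.0


def nominal_time_constant_ms(ohms, microfarads):
    """RC time constant in milliseconds (ohm x microfarad = microsecond)."""
    return ohms * microfarads / 1000.0


# Settings from the protocol above: 450 V, 0.2 cm cuvette, 200 Ohms, 250 uF.
print(field_strength_kv_per_cm(450, 0.2))
print(nominal_time_constant_ms(200, 250))
```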

B. Transformation of C. tropicalis Using Lithium Acetate 

The following protocol was used to transform C. tropicalis in accordance 
with the procedures described in Current Protocols in Molecular Biology, Supplement 5, 
13.7.1 (1989), incorporated herein by reference. 

5 ml of YEPD was inoculated with C. tropicalis H5343 ura- from a frozen 
stock and incubated overnight on a New Brunswick shaker at 30°C and 170 rpm. The next 
day, 10 µl of the overnight culture was inoculated into 50 ml YEPD and growth was 
continued at 30°C, 170 rpm. The following day the cells were harvested at an OD600 of 1.0. 
The culture was transferred to a 50 ml polypropylene tube and centrifuged at 1000 X g for 
10 min. The cell pellet was resuspended in 10 ml sterile TE (10 mM Tris-Cl and 1 mM 
EDTA, pH 8.0). The cells were again centrifuged at 1000 X g for 10 min and the cell 
pellet was resuspended in 10 ml of a sterile lithium acetate solution [LiAc (0.1 M lithium 
acetate, 10 mM Tris-Cl, pH 8.0, 1 mM EDTA)]. Following centrifugation at 1000 X g for 
10 min, the pellet was resuspended in 0.5 ml LiAc. This solution was incubated for one 
hour at 30°C while shaking gently at 50 rpm. A 0.1 ml aliquot of this suspension was 
incubated with 5 µg of transforming DNA at 30°C with no shaking for 30 min. A 0.7 ml 
PEG solution (40% wt/vol polyethylene glycol 3340, 0.1 M lithium acetate, 10 mM Tris-Cl, 
pH 8.0, 1 mM EDTA) was added and incubated at 30°C for 45 min. The tubes were then 
placed at 42°C for 5 min. A 0.2 ml aliquot was plated on synthetic complete media minus 
uracil (SC - uracil) (Kaiser et al., Methods in Yeast Genetics, Cold Spring Harbor 
Laboratory Press, USA, 1994, incorporated herein by reference). Growth of transformants 
was monitored for 5 days. After three days, several transformants were picked and 
transferred to SC-uracil plates for genomic DNA preparation and screening. 

EXAMPLE 5 
Plasmid DNA Isolation 

Plasmid DNA was isolated from E. coli cultures using a Qiagen plasmid 
isolation kit (Qiagen Inc., Chatsworth, CA) according to manufacturer's instructions. 

EXAMPLE 6 
DNA Sequencing and Analysis 

DNA sequencing was performed at Sequetech Corporation (Mountain 
View, CA) using Applied Biosystems automated sequencer (Perkin Elmer, Foster City, 
CA). DNA sequences were analyzed with MacVector and GeneWorks software packages 
(Oxford Molecular Group, Campbell, CA). 

EXAMPLE 7 
PCR Protocols 

PCR amplification was carried out in a Perkin Elmer Thermocycler using 
the AmpliTaq Gold enzyme (Perkin Elmer Cetus, Foster City, CA) kit according to 
manufacturer's specifications. Following successful amplification, in some cases, the 
products were digested with the appropriate enzymes and gel purified using QiaexII 
(Qiagen, Chatsworth, CA) as per manufacturer instructions. In specific cases the Ultma 
Taq polymerase (Perkin Elmer Cetus, Foster City, CA) or the Expand Hi-Fi Taq 
polymerase (Boehringer Mannheim, Indianapolis, IN) were used per manufacturer's 
recommendations or as defined in Table 3. 



Table 3. PCR amplification conditions used with different primer combinations.

PRIMER                             Taq                TEMPLATE        ANNEALING      EXTENSION                   CYCLE
COMBINATION                                           DENATURING      TEMP/TIME      TEMP/TIME                   NUMBER
                                                      CONDITION

3674-41-1/41-2/41-4 + 3674-41-4    AmpliTaq Gold      94°C/30 sec     55°C/30 sec    72°C/1 min                  30
URA Primer 1a, URA Primer 1b       AmpliTaq Gold      95°C/1 min      70°C/1 min     72°C/2 min                  35
URA Primer 2a, URA Primer 2b       AmpliTaq Gold      95°C/1 min      70°C/1 min     72°C/2 min                  35
CYP2A#1, CYP2A#2                   AmpliTaq Gold      95°C/1 min      70°C/1 min     72°C/2 min                  35
CYP3A#1, CYP3A#2                   Ultma Taq          95°C/1 min      70°C/1 min     72°C/1 min                  30
CPRB#1, CPRB#2                     Expand Hi-Fi Taq   94°C/15 sec     50°C/30 sec    68°C/3 min                  10
                                                      94°C/15 sec     50°C/30 sec    68°C/3 min +20 sec/cycle    15
CYP5A#1, CYP5A#2                   Expand Hi-Fi Taq   94°C/15 sec     50°C/30 sec    68°C/3 min                  10
                                                      94°C/15 sec     50°C/30 sec    68°C/3 min +20 sec/cycle    15
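
As a rough illustration of how the two-stage Expand Hi-Fi program in Table 3 translates into programmed hold time, the following Python sketch sums the denaturing, annealing and extension holds. It assumes that the 20 sec/cycle increment is applied cumulatively from the first cycle of the second stage, and it ignores ramp times, any initial denaturation and any final extension; none of these details is stated in the table, and the names `Stage` and `program_hold_time` are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    cycles: int
    denature_s: int
    anneal_s: int
    extend_s: int
    extend_increment_s: int = 0  # assumed to be added on each successive cycle


def stage_hold_time(stage: Stage) -> int:
    """Total programmed hold time (seconds) for one stage, ramp times excluded."""
    total = 0
    for cycle in range(1, stage.cycles + 1):
        extension = stage.extend_s + stage.extend_increment_s * cycle
        total += stage.denature_s + stage.anneal_s + extension
    return total


def program_hold_time(stages) -> int:
    """Sum of the hold times of all stages in a cycling program."""
    return sum(stage_hold_time(s) for s in stages)


# Expand Hi-Fi rows of Table 3: 10 cycles, then 15 cycles with +20 sec/cycle extension.
EXPAND_HIFI = [
    Stage(cycles=10, denature_s=15, anneal_s=30, extend_s=180),
    Stage(cycles=15, denature_s=15, anneal_s=30, extend_s=180, extend_increment_s=20),
]

# AmpliTaq Gold rows run at 95°C/1 min, 70°C/1 min, 72°C/2 min for 35 cycles.
AMPLITAQ_GOLD_35 = [Stage(cycles=35, denature_s=60, anneal_s=60, extend_s=120)]

if __name__ == "__main__":
    print(program_hold_time(EXPAND_HIFI) / 60, "min (Expand Hi-Fi, holds only)")
    print(program_hold_time(AMPLITAQ_GOLD_35) / 60, "min (AmpliTaq Gold, holds only)")
```

Under these assumptions the holds alone come to a little over two hours for either program, which is useful mainly for scheduling instrument time.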
Table 4 below contains a list of primers (SEQ ID NOS: 1-35) used for PCR amplification
to construct gene integration vectors or to generate probes for gene detection and isolation.

Table 4. Primer table for PCR amplification to construct gene integration vectors, to
generate probes for gene isolation and detection and to obtain DNA sequence of
constructs. (A- deoxyadenosine triphosphate [dATP], G- deoxyguanosine triphosphate
[dGTP], C- deoxycytosine triphosphate [dCTP], T- deoxythymidine triphosphate [dTTP],
Y- dCTP or dTTP, R- dATP or dGTP, W- dATP or dTTP, M- dATP or dCTP, N-
dATP or dCTP or dGTP or dTTP).



Target            Patent Primer    Lab Primer    Sequence (5' to 3')                                       PCR Product
gene(s)           Name             Name                                                                    Size

CYP52A2A          CYP2A#1          3659-72M      CCTTAATTAAATGCACGAAGCGGAGATAAAAG (SEQ ID NO: 1)           2230 bp
                  CYP2A#2          3659-72N      CCTTAATTAAGCATAAGCTTGCTCGAGTCT (SEQ ID NO: 2)
CYP52A3A          CYP3A#1          3659-72O      CCTTAATTAAACGCAATGGGAACATGGAGTG (SEQ ID NO: 3)            2154 bp
                  CYP3A#2          3659-72P      CCTTAATTAATCGCACTACGGTTATTGGTATCAG (SEQ ID NO: 4)
CYP52A5A          CYP5A#1          3659-72K      CCTTAATTAATCAAAGTACGTTCAGGCGG (SEQ ID NO: 5)              3298 bp
                  CYP5A#2          3659-72L      CCTTAATTAAGGCAGACAACAACTTGGCAAAGTC (SEQ ID NO: 6)
CPRB              CPRB#1           3698-20A      CCTTAATTAAGAGGTCGTTGGTTGAG[illegible] (SEQ ID NO: 7)      3266 bp
                  CPRB#2           3698-20B      CCTTAATTAATTGATAATGACGTTGCGGG (SEQ ID NO: 8)
URA3A             URA Primer 1a    3698-7C       AGGCGCGCCGGAGTCCAAAAAGACCAACCTCTG (SEQ ID NO: 9)          956 bp
                  URA Primer 1b    3698-7D       CCTTAATTAATACGTGGATACCTTCAAGCAAGTG (SEQ ID NO: 10)
URA3A             URA Primer 2a    3698-7A       CCTTAATTAAGCTCACGAGT[illegible] (SEQ ID NO: 11)           750 bp
                  URA Primer 2b    3698-7B       GGGTTTAAACCGCAGAGGTTGGTCTTTTTGGACTC (SEQ ID NO: 12)
                                                 GGGTTTAAAC - Pme I restriction site (SEQ ID NO: 13)
                                                 AGGCGCGCC - Asc I restriction site (SEQ ID NO: 14)
                                                 CCTTAATTAA - Pac I restriction site (SEQ ID NO: 15)
CPR               FMN1             3674-41-1     TCYCAAACWGGTACWGCWGAA (SEQ ID NO: 16)
CPR               FMN2             3674-41-2     GGTTTGGGTAAYTCWACTTAT (SEQ ID NO: 17)
CPR               FAD              3674-41-3     CGTTATTAYTCYA[illegible]CTTC (SEQ ID NO: 18)
CPR               NADPH            3674-41-4     GCMACACCRGTACCTGGACC (SEQ ID NO: 19)
CPR               PRK1.F3          PRK1.F3       ATCCCAATCGTAATCAGC (SEQ ID NO: 20)
CPR               PRK1.F5          PRK1.F5       ACTTGTCTTCG[illegible]GCA (SEQ ID NO: 21)
CPR               PRK4.R20         PRK4.R20      CTACGTCTGTGGTGATGC (SEQ ID NO: 22)
CYP               UCup1            UCup1         CGNGAYACNACNGCNGG (SEQ ID NO: 23)
CYP               UCup2            UCup2         AGRGAYACNACNGCNGG (SEQ ID NO: 24)
CYP               UCdown1          UCdown1       AGNGCRAAYTGYTGNCC (SEQ ID NO: 25)
CYP               UCdown2          UCdown2       YAANGCRAAYTGYTGNCC (SEQ ID NO: 26)
CYP               HemeB1           HemeB1        ATTCAACGGTGGTCCAAGAATCTGTTTGG (SEQ ID NO: 27)
CYP               2,3,5P           2,3,5P        GAGCTATGTTGAGACCACAGTTTGC (SEQ ID NO: 28)
CYP               2,3,5M           2,3,5M        CTTCAGTTAAAGCAAATTGTTTGGCC (SEQ ID NO: 29)
pTriplEx vector   Triplex5'        Triplex5'     CTCGGGAAGCGCGCCATTGTGTTGG (SEQ ID NO: 30)
pTriplEx vector   Triplex3'        Triplex3'     TAATACGACTCACTATAGGGCGAATTGGC (SEQ ID NO: 31)
CYP               Cyp52a           Cyp52a        TGRYTCAAACCATCTYTCTGG (SEQ ID NO: 32)
CYP               Cyp52b           Cyp52b        GGACCGGCG[illegible] (SEQ ID NO: 33)
CYP               Cyp52c           Cyp52c        CATAGTCGWATYATGCTTAGACC (SEQ ID NO: 34)
CYP               Cyp52d           Cyp52d        GGACCACCATTGAATGG (SEQ ID NO: 35)
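
Several of the primers in Table 4 are degenerate, so each one is really a pool of distinct oligonucleotides. The following Python sketch, offered only as an illustration, counts and enumerates the members of such a pool using the standard IUPAC meanings of the ambiguity codes that appear in the table (Y = C or T, R = A or G, W = A or T, M = A or C, N = any base). The function names `degeneracy` and `expand` are hypothetical helpers, not part of the disclosed method.

```python
from itertools import product

# Standard IUPAC meanings for the codes used in Table 4.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "CT", "R": "AG", "W": "AT", "M": "AC", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list[str]:
    """Enumerate every non-degenerate sequence in the primer pool."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


FMN1 = "TCYCAAACWGGTACWGCWGAA"   # SEQ ID NO: 16
NADPH = "GCMACACCRGTACCTGGACC"   # SEQ ID NO: 19
UCUP1 = "CGNGAYACNACNGCNGG"      # SEQ ID NO: 23

if __name__ == "__main__":
    for name, seq in (("FMN1", FMN1), ("NADPH", NADPH), ("UCup1", UCUP1)):
        print(name, degeneracy(seq))
    print(expand(NADPH))
```

The pool size matters in practice because the effective concentration of any single matching oligonucleotide falls as the degeneracy rises.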
EXAMPLE 8

Yeast Colony PCR Procedure for Confirmation of Gene
Integration into the Genome of C. tropicalis

Single yeast colonies were removed from the surface of transformation
plates, suspended in 50 µl of spheroplasting buffer (50 mM KCl, 10 mM Tris-HCl, pH 8.3,
1.0 mg/ml Zymolyase, 5% glycerol) and incubated at 37°C for 30 min. Following
incubation, the solution was heated for 10 min at 95°C to lyse the cells. Five µl of this
solution was used as a template in PCR. Expand Hi-Fi Taq polymerase (Boehringer
Mannheim, Indianapolis, IN) was used in PCR coupled with a gene-specific primer (gene
to be integrated) and a URA3 primer. If integration did occur, amplification would yield a
PCR product of predicted size confirming the presence of an integrated gene.

EXAMPLE 9

Fermentation Method for Gene Induction Studies

A fermentor was charged with a semi-synthetic growth medium having the
composition 75 g/l glucose (anhydrous), 6.7 g/l Yeast Nitrogen Base (Difco Laboratories), 3
g/l yeast extract, 3 g/l ammonium sulfate, 2 g/l monopotassium phosphate, 0.5 g/l sodium
chloride. Components were made as concentrated solutions for autoclaving then added to
the fermentor upon cooling: final pH approximately 5.2. This charge was inoculated with
5-10% of an overnight culture of C. tropicalis ATCC 20962 prepared in YM medium
(Difco Laboratories) as described in the methods of Examples 17 and 20 of US Patent
5,254,466, which is incorporated herein by reference. C. tropicalis ATCC 20962 is a POX
4 and POX 5 disrupted C. tropicalis ATCC 20336. Air and agitation were supplied to
maintain the dissolved oxygen at greater than about 40% of saturation versus air. The pH
was maintained at about 5.0 to 8.5 by the addition of 5N caustic soda on pH control. Both
a fatty acid feedstream (commercial oleic acid in this example) having a typical
composition: 2.4% C14; 0.7% C14:1; 4.6% C16; 5.7% C16:1; 5.7% C17:1; 1.0% C18; 69.9% C18:1; 8.8%
C18:2; 0.30% C18:3; 0.90% C20:1 and a glucose co-substrate feed were added in a fed-batch mode
beginning near the end of exponential growth. Caustic was added on pH control during
the bioconversion of fatty acids to diacids to maintain the pH in the desired range.
Typically, samples for gene induction studies were collected just prior to starting the fatty
acid feed and over the first 10 hours of bioconversion. Fatty acid and
diacid content was determined by a standard methyl ester protocol using gas liquid
chromatography (GLC). Gene induction was measured using the QC-RT-PCR protocol
described in this application.

EXAMPLE 10
RNA Preparation

The first step of this protocol involves the isolation of total cellular RNA
from cultures of C. tropicalis. The cellular RNA was isolated using the Qiagen RNeasy
Mini Kit (Qiagen Inc., Chatsworth, CA) as follows: 2 ml samples of C. tropicalis cultures
were collected from the fermentor in standard 2 ml screw capped Eppendorf style tubes
at various times before and after the addition of the fatty acid or alkane substrate. Cell
samples were immediately frozen in liquid nitrogen or a dry-ice/alcohol bath after their
harvesting from the fermentor. To isolate total RNA from the samples, the tubes were
allowed to thaw on ice and the cells pelleted by centrifugation in a microfuge for 5
minutes (min) at 4°C and the supernatant was discarded while keeping the pellet ice-cold.
The microfuge tubes were filled 2/3 full with ice-cold Zirconia/Silica beads (0.5 mm
diameter, Biospec Products, Bartlesville, OK) and the tube filled to the top with ice-cold
RLT* lysis buffer (* buffer included with the Qiagen RNeasy Mini Kit). Cell rupture
was achieved by placing the samples in a mini bead beater (Biospec Products,
Bartlesville, OK) and immediately homogenizing at full speed for 2.5 min. The samples
were allowed to cool in an ice water bath for 1 minute and the homogenization/cool
process repeated two more times for a total of 7.5 min homogenization time in the
beadbeater. The homogenized cell samples were microfuged at full speed for 10 min
and 700 µl of the RNA containing supernatant removed and transferred to a new
eppendorf tube. 700 µl of 70% ethanol was added to each sample followed by mixing by
inversion. This and all subsequent steps were performed at room temperature. Seven
hundred microliters of each ethanol treated sample were transferred to a Qiagen RNeasy
spin column, followed by centrifugation at 8,000 x g for 15 sec. The flow through was
discarded and the column reloaded with the remaining sample (700 µl) and re-centrifuged
at 8,000 x g for 15 sec. The column was washed once with 700 µl of buffer
RW1*, and centrifuged at 8,000 x g for 15 sec and the flow through discarded. The
column was placed in a new 2 ml collection tube and washed with 500 µl of RPE* buffer
and the flow through discarded. The RPE* wash was repeated with centrifugation at
8,000 x g for 2 min and the flow through discarded. The spin column was transferred to a
new 1.5 ml collection tube and 100 µl of RNase free water added to the column followed
by centrifugation at 8,000 x g for 15 seconds. An additional 75 µl of RNase free water
was added to the column followed by centrifugation at 8,000 x g for 2 min. RNA eluted
in the water flow through was collected for further purification.

The RNA eluate was then treated to remove contaminating DNA. Twenty
microliters of 10X DNase I buffer (0.5 M Tris (pH 7.5), 50 mM CaCl2, 100 mM MgCl2),
10 µl of RNase-free DNase I (2 Units/µl, Ambion Inc., Austin, Texas) and 40 units
RNasin (Promega Corporation, Madison, Wisconsin) were added to the RNA sample.
The mixture was then incubated at 37°C for 15 to 30 min. Samples were placed on ice
and 250 µl Lysis buffer RLT* and 250 µl ethanol (200 proof) added. The samples were
then mixed by inversion. The samples were transferred to Qiagen RNeasy spin columns
and centrifuged at 8,000 x g for 15 sec and the flow through discarded. Columns were
placed in new 2 ml collection tubes and washed twice with 500 µl of RPE* wash buffer
and the flow through discarded. Columns were transferred to new 1.5 ml eppendorf
tubes and RNA was eluted by the addition of 100 µl of DEPC treated water followed by
centrifugation at 8,000 x g for 15 sec. Residual RNA was collected by adding an
additional 50 µl of RNase free water to the spin column followed by centrifugation at full
speed for 2 min. 10 µl of the RNA preparation was removed and quantified by the (A260/280)
method. RNA was stored at -70°C. Yields were found to be 30-100 µg total RNA per
2.0 ml of fermentation broth.

EXAMPLE 11



Quantitative Competitive Reverse Transcription Polymerase 
Chain Reaction (QC-RT-PCR) Protocol 

QC-RT-PCR is a technique used to quantitate the amount of a specific 
RNA in a RNA sample. This technique employs the synthesis of a specific DNA 
molecule that is complementary to an RNA molecule in the original sample by reverse 
transcription and its subsequent amplification by polymerase chain reaction. By the 
addition of various amounts of a competitor RNA molecule to the sample one can 
determine the concentration of the RNA molecule of interest (in this case the mRNA 
transcripts of the CYP and CPR genes). The levels of specific mRNA transcripts were 
assayed over time in response to the addition of fatty acid and/or alkane substrates to the 
growth medium of fermentation grown C. tropicalis cultures for the identification and 
characterization of the genes involved in the oxidation of these substrates. This approach 
can be used to identify the CYP and CPR genes involved in the oxidation of any given 
substrate based upon their transcriptional regulation. 

A. Primer Design 

The first requirement for QC-RT-PCR is the design of the primer pairs to 
be used in the reverse transcription and subsequent PCR reactions. These primers need to 
be unique and specific to the gene of interest. As there is a family of genetically similar 
CYP genes present in C. tropicalis 20336, care had to be taken to design primer pairs that 
would be discriminating and only amplify the gene of interest, in this example the
CYP52A5 gene. In this manner, unique primers directed to substantially non-homologous 
(aka variable) regions within target members of a gene family are constructed. What 
constitutes substantially non-homologous regions is determined on a case by case basis. 
Such unique primers should be specific enough to anneal to the non-homologous region of
the target gene without annealing to other non-target members of the gene family. By
comparing the known sequences of the members of a gene family, non-homologous
regions are identified and unique primers are constructed which will anneal to those
regions. It is contemplated that non-homologous regions herein would typically exhibit
less than about 85% homology but can be more homologous depending on the positions
which are conserved and stringency of the reaction. After conducting PCR, it may be
helpful to check the reaction product to assure it represents the unique target gene 
product. If not, the reaction conditions can be altered in terms of stringency to focus the 
reaction to the desired target. Alternatively a new primer or new non-homologous region 
can be chosen. Due to the high level of homology between the genes of the CYP52A 
family, the most variable 5 prime region of the CYP52A5 coding sequence was targeted 
for the design of the primer pairs. In Figure 3, a portion of the 5 prime coding region for 
the CYP52A5A (SEQ ID NO: 36) allele of C. tropicalis 20336 is shown. The boxed 
sequences in Figure 3 are the sequences of the forward and backwards primers (SEQ ID 
NOS: 47 and 48) used to quantitate expression of both alleles of this gene. The actual 
reverse primer (SEQ ID NO: 48) contains one less adenine than that shown in Figure 3. 
Primers used to measure the expression of specific C. tropicalis 20336 genes using the 
QC-RT-PCR protocol are listed in Table 5 (SEQ ID NOS: 37-58).
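
The search for substantially non-homologous regions described above can be sketched in Python as a sliding-window identity scan over aligned gene family members. This is a minimal illustration only: it assumes the sequences have already been aligned to equal length (gaps written as "-"), treats percent identity as a stand-in for "homology", and uses the approximately 85% figure from the text as a default cutoff. The function names and the toy sequences are hypothetical and are not C. tropicalis data.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned positions carrying the same base (gaps never match)."""
    if len(a) != len(b) or not a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def variable_windows(target: str, others: list[str], window: int = 20,
                     cutoff: float = 85.0) -> list[tuple[int, float]]:
    """Return (start, identity to the closest family member) for every window
    in which even the closest non-target member falls below the cutoff."""
    hits = []
    for start in range(len(target) - window + 1):
        segment = target[start:start + window]
        closest = max(percent_identity(segment, o[start:start + window])
                      for o in others)
        if closest < cutoff:
            hits.append((start, closest))
    return hits


# Hypothetical toy alignment: identical 5' half, divergent 3' half.
TARGET = "ACGTACGTAC" + "GGGGGGGGGG"
OTHER = "ACGTACGTAC" + "TTTTTTTTTT"
HITS = variable_windows(TARGET, [OTHER], window=10)

if __name__ == "__main__":
    for start, identity in HITS:
        print(f"window at {start}: {identity:.0f}% identical to closest member")
```

Windows reported by such a scan are only candidates; as the text notes, the PCR product should still be checked to confirm that it represents the unique target gene.
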
Table 5. Primers used to measure C. tropicalis gene expression in
the QC-RT-PCR reactions.

Primer        Direction   Target                 Sequence
Name

3737-89F      F           CYP52A1A               CCGATGAAGTTTTCGACGAGTACCC (SEQ ID NO: 37)
3737-89B      B           CYP52A1A               AAGGCTTTAACGTGTCCAATCTGGTC (SEQ ID NO: 38)
alk2aF1       F           CYP52A2A               ATTATCGCCACATACTTCACCAAATGG (SEQ ID NO: 39)
alk2aB5       B           CYP52A2A               CGAGATCGTGGATACGCTGGAGTG (SEQ ID NO: 40)
7581-178-3    F           CYP52A3A               GCCACTCGGTAACTTTGTCAGGGAC (SEQ ID NO: 41)
7581-178-4    B           CYP52A3A               CATTGAACTGAGTAGCCAAAACAGCC (SEQ ID NO: 42)
3737-50F      F           CYP52A3A & CYP52A3B    CCTACGTTTGGTATCGCTACTCCGTTG (SEQ ID NO: 43)
3737-50B      B           CYP52A3A & CYP52A3B    TTTCCAGCCAGCACCGTCCAAG (SEQ ID NO: 44)
3737-175F     F           CYP52D4A               GCAGAGCCGATCTATGTTGCGTCC (SEQ ID NO: 45)
3737-175B     B           CYP52D4A               TCATTGAATGCTTCCAGGAACCTCG (SEQ ID NO: 46)
7581-97-F     F           CYP52A5A & CYP52A5B    AAGAGGGCAGGGCTCAAGAG (SEQ ID NO: 47)
7581-97-M     B           CYP52A5A & CYP52A5B    TCCATGTGAAGATCCCATCAC (SEQ ID NO: 48)
4P-2          F           CYP52A8A               CTTGAAGGCCGTGTTGAACG (SEQ ID NO: 49)
4M-1          B           CYP52A8A               CAGGATTTGTCTGAGTTGCCG (SEQ ID NO: 50)
3737-52F      F           POX4A & POX4B          CCATTGCCTTGAGATACGCCATTGGTAG (SEQ ID NO: 51)
3737-52B      B           POX4A & POX4B          AGCCTTGGTGTCGTTCTTTTCAACGG (SEQ ID NO: 52)
3737-53F      F           POX5A                  TTGGGTTTGTTTGTTTCCTGTGTCCG (SEQ ID NO: 53)
3737-53B      B           POX5A                  CCTTTGACCTTCAATCTGGCGTAGACG (SEQ ID NO: 54)
F33           F           CPRA                   GGTTTGCTGAATACGCTGAAGGTGATG (SEQ ID NO: 55)
B63           B           CPRA                   TGGAGCTGAACAACTCTCTCGTCTCGG (SEQ ID NO: 56)
3737-133F     F           CPRA & CPRB            TTCCTCAACACGGACAGCGG (SEQ ID NO: 57)
3737-133B     B           CPRA & CPRB            AGTCAACCAGGTGTGGAACTCGTC (SEQ ID NO: 58)

F = Forward, B = Backward



B. Design and Synthesis of the Competitor DNA Template 

The competitor RNA is synthesized in vitro from a competitor DNA
template that has the T7 polymerase promoter and preferably carries a small deletion of,
e.g., about 10 to 25 nucleotides relative to the native target RNA sequence. The DNA
template for the in vitro synthesis of the competitor RNA is synthesized using PCR primers
that are between 46 and 60 nucleotides in length. In this example, the primer pairs for the
synthesis of the CYP52A5 competitor DNA are shown in Tables 6 and 7 (SEQ ID NOS:
59 and 60).



Table 6. Forward and Reverse primers used to synthesize the competitor RNA template
for the QC-RT-PCR measurement of CYP52A5A gene expression.

Forward Primer    CYP52A5A    GGATCCTAATACGACTCACTATAGGGAGGAAGAGGGCAGGGCTCAAGAG (SEQ ID NO: 59)
Reverse Primer    CYP52A5A    TCCATGTGAAGATCCCATCACGAGTGTGCCTCTTGCCCAAAG (SEQ ID NO: 60)



Table 7. Primers for the synthesis of the QC-RT-PCR competitor RNA templates

Primer        Direction   Target                 Sequence 5'-3'
Name

3737-89C      F           CYP52A1A               GGATCCTAATACGACTCACTATAGGGAGGCCGATGAAGTTTTCGACGAGTACCC (SEQ ID NO: 61)
3737-89D      B           CYP52A1A               AAGGCTTTAACGTGTCCAATCTGGTCAACATAGCTCTGGAGTGCTTCCAACC (SEQ ID NO: 62)
7581-137-A    F           CYP52A2A               GGATCCTAATACGACTCACTATAGGGAGGATTATCGCCACATACTTCACCAAATGG (SEQ ID NO: 63)
7581-137-B    B           CYP52A2A               CGAGATCGTGGATACGCTGGAGTGCGTCGCTCTTCTTCTTCAACAATTCAAG (SEQ ID NO: 64)
7581-137-D    B           CYP52A3A               CATTGAACTGAGTAGCCAAAACAGCCCATGGTTTCAATCAATGGGAGGC (SEQ ID NO: 65)
7581-137-C    F           CYP52A3A               GGATCCTAATACGACTCACTATAGGGAGGGCCACTCGGTAACTTTGTCAGGGAC (SEQ ID NO: 66)
3737-50-D     F           CYP52A3A & CYP52A3B    GGATCCTAATACGACTCACTATAGGGAGGCCTACGTTTGGTATCGCTACTCCGTTG (SEQ ID NO: 67)
3737-50-C     B           CYP52A3A & CYP52A3B    TTTCCAGCCAGCACCGTCCAAGCAACAAGGAGTACAAGAAATCGTGTC (SEQ ID NO: 68)
3737-175C     F           CYP52D4A               GGATCCTAATACGACTCACTATAGGGAGGGCAGAGCCGATCTATGTTGCGTCC (SEQ ID NO: 69)
3737-175D     B           CYP52D4A               TCATTGAATGCTTCCAGGAACCTCGCCACATCCATCGAGAACCGG (SEQ ID NO: 70)
7581-97-A     F           CYP52A5A & CYP52A5B    GGATCCTAATACGACTCACTATAGGGAGGAAGAGGGCAGGGCTCAAGAG (SEQ ID NO: 59)
7581-97-B     B           CYP52A5A & CYP52A5B    TCCATGTGAAGATCCCATCACGAGTGTGCCTCTTGCCCAAAG (SEQ ID NO: 60)
4P-2/T7       F           CYP52A8A               GGATCCTAATACGACTCACTATAGGGAGGCTTGAAGGCCGTGTTGAACG (SEQ ID NO: 71)
4M-3/4M-1     B           CYP52A8A               CAGGATTTGTCTGAGTTGCCGCCTGATCAAGATAGGATCCTTGCCG (SEQ ID NO: 72)
3737-26-D     F           CPRA                   GGATCCTAATACGACTCACTATAGGGAGGGGTTTGCTGAATACGCTGAAGGTGATG (SEQ ID NO: 73)
3737-26-C     B           CPRA                   TGGAGCTGAACAACTCTCTCGTCTCGGGTGGTCGAATGGACCCTTGGTCAAG (SEQ ID NO: 74)
3737-133C     F           CPRA & CPRB            GGATCCTAATACGACTCACTATAGGGAGGTTCCTCAACACGGACAGCGG (SEQ ID NO: 75)
3737-133D     B           CPRA & CPRB            AGTCAACCAGGTGTGGAACTCGTCGGTGGCAACAATGAAAAACACCAAG (SEQ ID NO: 76)
3737-52-C     F           POX4A & POX4B          GGATCCTAATACGACTCACTATAGGGAGGCCATTGCCTTGAGATACGCCATTGGTAG (SEQ ID NO: 77)
3737-52-D     B           POX4A & POX4B          AGCCTTGGTGTCGTTCTTTTCAACGGAAGGTGGTCTCGATGGTGTGTTCAACC (SEQ ID NO: 78)
3737-53-C     F           POX5A                  GGATCCTAATACGACTCACTATAGGGAGGTTGGGTTTGTTTGTTTCCTGTGTCCG (SEQ ID NO: 79)
3737-53-D     B           POX5A                  CCTTTGACCTTCAATCTGGCGTAGACGCAGCACCACCGATCCACCACTTG (SEQ ID NO: 80)

F = Forward, B = Backward



The forward primer (SEQ ID NO: 59) contains the T7 promoter consensus sequence
"GGATCCTAATACGACTCACTATAGGGAGG" (SEQ ID NO: 109) fused to the primer
7581-97-F sequence (SEQ ID NO: 47). The reverse primer (SEQ ID NO: 60) contains the
sequence of primer 7581-97M (SEQ ID NO: 48) followed by the 20 bases of upstream sequence
with an 18 base pair deletion between the two blocks of the CYP52A5 sequence. The forward
primer was used with the corresponding reverse primer to synthesize the competitor DNA
template. The primer pairs were combined in a standard Taq Gold polymerase PCR reaction
according to the manufacturer's recommended conditions (Perkin-Elmer/Applied Biosystems,
Foster City, CA). The PCR reaction mix contained a final concentration of 250 nM each primer
and 10 ng C. tropicalis chromosomal DNA for template. The reaction mixture was placed in a
thermocycler for 25 to 35 cycles using the highest annealing temperature possible during the
PCR reactions to assure a homogeneous PCR product (in this case 62°C). The PCR products
were either gel purified or filter purified to remove unincorporated nucleotides and primers.
The competitor template DNA was then quantified using the (A260/280) method. Primers used in
QC-RT-PCR experiments for the synthesis of various competitive DNA templates are listed in
Table 7 (SEQ ID NOS: 61-80).
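
The construction of the competitor primers described above can be checked with a short Python sketch. It simply concatenates the T7 promoter block with the forward QC-RT-PCR primer, and the reverse QC-RT-PCR primer with the internal block lying on the far side of the deletion, then compares the results with the sequences given in Table 6. The function names are hypothetical, and the internal block is taken directly from SEQ ID NO: 60 rather than derived from the gene sequence, which is not reproduced here.

```python
T7_PROMOTER = "GGATCCTAATACGACTCACTATAGGGAGG"  # SEQ ID NO: 109 as quoted in the text

FORWARD_7581_97_F = "AAGAGGGCAGGGCTCAAGAG"     # SEQ ID NO: 47
REVERSE_7581_97_M = "TCCATGTGAAGATCCCATCAC"    # SEQ ID NO: 48

# Portion of SEQ ID NO: 60 that follows the reverse primer sequence (from Table 6).
INTERNAL_BLOCK = "GAGTGTGCCTCTTGCCCAAAG"

SEQ_ID_59 = "GGATCCTAATACGACTCACTATAGGGAGGAAGAGGGCAGGGCTCAAGAG"
SEQ_ID_60 = "TCCATGTGAAGATCCCATCACGAGTGTGCCTCTTGCCCAAAG"


def competitor_forward(forward_primer: str) -> str:
    """T7 promoter block fused 5' of the forward QC-RT-PCR primer."""
    return T7_PROMOTER + forward_primer


def competitor_reverse(reverse_primer: str, internal_block: str) -> str:
    """Reverse QC-RT-PCR primer followed by the block beyond the deleted bases,
    so that the amplified template lacks the sequence between the two blocks."""
    return reverse_primer + internal_block


if __name__ == "__main__":
    print(competitor_forward(FORWARD_7581_97_F) == SEQ_ID_59)
    print(competitor_reverse(REVERSE_7581_97_M, INTERNAL_BLOCK) == SEQ_ID_60)
```

Because the competitor retains both QC-RT-PCR primer binding sites, it is amplified by the same primer pair as the native transcript and differs from it only in length.
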
C. Synthesis of the Competitor RNA

Competitor template DNA was transcribed in vitro to make the competitor RNA
using the Megascript T7 kit from Ambion Biosciences (Ambion Inc., Austin, Texas). 250
nanograms (ng) of competitor DNA template and the in vitro transcription reagents were mixed
according to the directions provided by the manufacturer. The reaction mixture was incubated
for 4 hours at 37°C. The resulting RNA preparations were then checked by gel electrophoresis
for the conditions giving the highest yields and quality of competitor RNA. This often required
optimization according to the manufacturer's specifications. The DNA template was then
removed using DNase I as described in the Ambion kit. The RNA competitor was then
quantified by the (A260/280) method. Serial dilutions of the RNA (1 ng/µl to 1 femtogram (fg)/µl)
were made for use in the QC-RT-PCR reactions and the original stocks stored at -70°C.
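
The dilution range above spans six orders of magnitude. As an illustration only, the following Python sketch lists the concentrations in such a series, assuming tenfold steps (the step size is not stated in the text). The function name `tenfold_series_fg` is hypothetical, and concentrations are held as whole femtograms per microliter to avoid rounding.

```python
def tenfold_series_fg(start_fg_per_ul: int, stop_fg_per_ul: int) -> list[int]:
    """Concentrations (fg/µl) of a tenfold serial dilution, highest first."""
    series = []
    conc = start_fg_per_ul
    while conc >= stop_fg_per_ul and conc > 0:
        series.append(conc)
        conc //= 10  # each step is a 1:10 dilution of the previous tube
    return series


NG_IN_FG = 1_000_000  # 1 ng = 10^6 fg

# 1 ng/µl down to 1 fg/µl, as in the text.
SERIES = tenfold_series_fg(1 * NG_IN_FG, 1)

if __name__ == "__main__":
    for conc in SERIES:
        print(f"{conc} fg/µl")
```

With tenfold steps the range corresponds to seven tubes; a finer step would give more points around the expected equivalence concentration.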



D. QC-RT-PCR Reactions 

QC-RT-PCR reactions were performed using rTth polymerase from Perkin-Elmer
(Perkin-Elmer/Applied Biosystems, Foster City, CA) according to the manufacturer's
recommended conditions. The reverse transcription reaction was performed in a 10 µl volume
with final concentrations of 200 µM for each dNTP, 1.25 units rTth polymerase, 1.0 mM
MnCl2, 1X of the 10X buffer supplied with the enzyme from the manufacturer,
100 ng of total RNA isolated from a fermentor grown culture of C. tropicalis and 1.25 µM of
the appropriate reverse primer. To quantitate CYP52A5 expression in C. tropicalis an
appropriate reverse primer was 7581-97M (SEQ ID NO: 48). Several reaction mixes were
prepared for each RNA sample characterized. To quantitate CYP52A5 expression a series of 8
to 12 of the previously described QC-RT-PCR reaction mixes were aliquoted to different
reaction tubes. To each tube 1 µl of a serial dilution containing from 100 pg to 100 fg CYP52A5
competitor RNA per µl was added bringing the final reaction mixtures up to the final volume of
10 µl. The QC-RT-PCR reaction mixtures were mixed and incubated at 70°C for 15 min
according to the manufacturer's recommended times for reverse transcription to occur. At the
completion of the 15 minute incubation, the sample temperature was reduced to 4°C to stop the
reaction and 40 µl of the PCR reaction mix added to the reaction to bring the total volume up to
50 µl. The PCR reaction mix consists of an aqueous solution containing 0.3125 µM of the
forward primer 7581-97-F (SEQ ID NO: 47), 3.125 mM MgCl2 and 1X chelating buffer supplied
with the enzyme from Perkin-Elmer. The reaction mixtures were placed in a thermocycler
(Perkin-Elmer GeneAmp PCR System 2400, Perkin-Elmer/Applied Biosystems, Foster City, CA)
and the following PCR cycle performed: 94°C for 1 min followed by 94°C for 10 seconds
followed by 58°C for 40 seconds for 17 to 22 cycles. The PCR reaction was completed with a
final incubation at 58°C for 2 min followed by 4°C. In some reactions where no detectable PCR
products were produced the samples were returned to the thermocycler for additional cycles; this
process was repeated until enough PCR products were produced to quantify using HPLC. The
number of cycles necessary to produce enough PCR product is a function of the amount of the
target mRNA in the 100 ng of total cellular RNA. In cultures where the CYP52A5 gene is highly
expressed there is sufficient CYP52A5 mRNA message present and fewer PCR cycles (<17) are
required to produce a quantifiable amount of PCR product. The lower the concentration of the
target mRNA present the more PCR cycles are required to produce a detectable amount of
product. These QC-RT-PCR procedures were applied to all the target genes listed in Table 5
using the respective primers indicated therein.
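
The concentrations quoted for the 40 µl PCR reaction mix make more sense once the dilution into the 50 µl final volume is worked through. The Python sketch below does that arithmetic. It assumes the reverse primer is present at 1.25 µM in the 10 µl reverse transcription reaction (the unit is inferred, since the source gives only the number) and that nothing else contributes primer or magnesium; `final_conc` is a hypothetical helper.

```python
def final_conc(stock_conc: float, stock_volume_ul: float, final_volume_ul: float) -> float:
    """Concentration after a component in stock_volume_ul is brought to final_volume_ul."""
    return stock_conc * stock_volume_ul / final_volume_ul


RT_VOLUME_UL = 10.0       # reverse transcription reaction
PCR_MIX_VOLUME_UL = 40.0  # PCR reaction mix added after reverse transcription
TOTAL_VOLUME_UL = RT_VOLUME_UL + PCR_MIX_VOLUME_UL

# Forward primer and MgCl2 arrive with the 40 µl PCR mix.
FORWARD_PRIMER_UM = final_conc(0.3125, PCR_MIX_VOLUME_UL, TOTAL_VOLUME_UL)
MGCL2_MM = final_conc(3.125, PCR_MIX_VOLUME_UL, TOTAL_VOLUME_UL)

# Reverse primer is carried over from the 10 µl reverse transcription reaction
# (1.25 µM is an assumed unit, see the lead-in).
REVERSE_PRIMER_UM = final_conc(1.25, RT_VOLUME_UL, TOTAL_VOLUME_UL)

if __name__ == "__main__":
    print(f"forward primer: {FORWARD_PRIMER_UM} µM")
    print(f"reverse primer: {REVERSE_PRIMER_UM} µM")
    print(f"MgCl2: {MGCL2_MM} mM")
```

On these assumptions both primers end up at the same final concentration in the 50 µl reaction, which is consistent with the 250 nM per primer used for the competitor template PCR.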

E. HPLC Quantification 

Upon completion of the QC-RT-PCR reactions the samples were analyzed and
quantitated by HPLC. Five to fifteen microliters of the QC-RT-PCR reaction mix was injected
into a Waters Bio-Compatible 625 HPLC with an attached Waters 484 tunable detector. The
detector was set to measure a wavelength of 254 nm. The HPLC contained a Sarasep brand
DNASep™ column (Sarasep, Inc., San Jose, CA) which was placed within the oven and the
temperature set for 52°C. The column was installed according to the manufacturer's
recommendation of having 30 cm of heated PEEK tubing installed between the injector and the
column. The system was configured with a Sarasep brand Guard column positioned before the
injector. In addition, there was a 0.22 µm filter disk just before the column, within the oven.
Two buffers were used to create an elution gradient to resolve and quantitate the PCR products
from the QC-RT-PCR reactions. Buffer-A consists of 0.1 M tri-ethyl ammonium acetate (TEAA)
and 5% acetonitrile (volume to volume). Buffer-B consists of 0.1 M TEAA and 25% acetonitrile
(volume to volume). The QC-RT-PCR samples were injected into the HPLC and the linear
gradient of 75% buffer-A/25% buffer-B to 45% buffer-A/55% buffer-B was run over 6 min at a flow rate
of 0.85 ml per minute. The QC-RT-PCR product of the competitor RNA (C), being 18 base pairs
smaller, is eluted from the HPLC column before the QC-RT-PCR product from the CYP52A5
mRNA (U). The amounts of the QC-RT-PCR products are plotted and quantitated with an
attached Waters Corporation 745 data module. The log ratios of the amount of CYP52A5
mRNA QC-RT-PCR product (U) to competitor QC-RT-PCR product (C), as measured by peak
areas, were plotted and the amount of competitor RNA required to equal the amount of CYP52A5
mRNA product determined. In the case of each of the target genes listed in Table 5, the
competitor RNA contained fewer base pairs as compared to the native target mRNA and eluted
before the native mRNA in a manner similar to that demonstrated by CYP52A5. HPLC
quantification of the genes was conducted as above.
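
The equivalence determination described above can be sketched numerically. The Python example below fits a straight line to log10(U/C) against log10(competitor amount) by ordinary least squares and solves for the competitor amount at which the log ratio is zero, i.e. where the two products are equal. The least-squares fit is an assumption about how the plotted points would be interpolated, the function name is hypothetical, and the peak-area ratios are invented illustrative values, not measurements from this work.

```python
import math


def equivalence_point(competitor_amounts, ratios_u_over_c):
    """Competitor amount at which log10(U/C) = 0, from a least-squares line
    through the points (log10 competitor amount, log10 U/C)."""
    xs = [math.log10(c) for c in competitor_amounts]
    ys = [math.log10(r) for r in ratios_u_over_c]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # log ratio is zero where slope * x + intercept = 0
    return 10 ** (-intercept / slope)


# Hypothetical titration: competitor added at 0.1 to 100 pg per reaction,
# with peak-area ratios that would arise from a target equivalent to 5 pg.
COMPETITOR_PG = [0.1, 1.0, 10.0, 100.0]
RATIO_U_OVER_C = [50.0, 5.0, 0.5, 0.05]

EQUIVALENT_PG = equivalence_point(COMPETITOR_PG, RATIO_U_OVER_C)

if __name__ == "__main__":
    print(f"target mRNA equivalent to about {EQUIVALENT_PG:.2f} pg competitor RNA")
```

In real data the slope will generally deviate from -1, which is one reason to bracket the expected equivalence point with several competitor dilutions.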



EXAMPLE 12 
Evaluation of New Strains in Shake Flasks 

The CYP and CPR amplified strains such as strains HDC10, HDC15, HDC20
and HDC23 (Table 1) and H5343 were evaluated for diacid production in shake flasks. A single
colony for each strain was transferred from a YPD agar plate into 5 ml of YPD broth and grown
overnight at 30°C, 250 rpm. An inoculum was then transferred into 50 ml of DCA2 medium
(Table 9) and grown for 24 h at 30°C, 300 rpm. The cells were centrifuged at 5000 rpm for 5
min and resuspended in 50 ml of DCA3 medium (Table 9) and grown for 24 h at 30°C, 300
rpm. 3% oleic acid w/v was added after 24 h growth in DCA3 medium and the cultures were
allowed to bioconvert oleic acid for 48 h. Samples were harvested and the diacid and monoacid
concentrations were analyzed as per the scheme given in Figure 35. Each strain was tested in
duplicate and the results shown in Table 8 represent the average value from two flasks.

Table 8. Bioconversion of oleic acid by different recombinant strains of Candida tropicalis

Strain        Conversion to     Specific Conversion
              Oleic diacid      (g diacid/g biomass)

H5343         41.9              0.53
HDC 10-2      50.5              0.85
HDC 15        54.4              0.85
HDC 20-1      45.1              0.72
HDC 20-2      45.3              0.58
HDC 23-2      55.2              0.84
HDC 23-3      58.8              0.89


EXAMPLE 13 

Cloning and Characterization of C. tropicalis 20336 Cytochrome P450
Monooxygenase (CYP) and Cytochrome P450 NADPH Oxidoreductase (CPR) Genes

To clone CYP and CPR genes several different strategies were employed.
Available CYP amino acid sequences were aligned and regions of similarity were observed
(Figure 4). These regions corresponded to described conserved regions seen in other
cytochrome P450 families (Goeptar et al., supra and Kalb et al., supra). Proteins from eight
eukaryotic cytochrome P450 families share a segmented region of sequence similarity. One
region corresponded to the HR2 domain containing the invariant cysteine residue near the
carboxyl terminus which is required for heme binding while the other region corresponded to
the central region of the I helix thought to be involved in substrate recognition (Figure 4).
Degenerate oligonucleotide primers corresponding to these highly conserved regions of the
CYP52 gene family present in Candida maltosa and Candida tropicalis ATCC 750 were
designed and used to amplify DNA fragments of CYP genes from C. tropicalis 20336 genomic
DNA. These discrete PCR fragments were then used as probes to isolate full-length CYP
genes from the C. tropicalis 20336 genomic libraries. In a few instances oligonucleotide
primers corresponding to highly conserved regions were directly used as probes to isolate
full-length CYP genes from genomic libraries. In the case of CPR a heterologous probe based upon
the known DNA sequence for the CPR gene from C. tropicalis 750 was used to isolate the C.
tropicalis 20336 CPR gene.

A. Cloning of the CPR Gene from C. tropicalis 20336 
1) Cloning of the CPRA Allele

Approximately 25,000 phage particles from the first genomic library of C.
tropicalis 20336 were screened with a 1.9 kb BamHI-NdeI fragment from plasmid pCU3RED
(See Picataggio et al., Bio/Technology 10:894-898 (1992), incorporated herein by reference)
containing most of the C. tropicalis 750 CPR gene. Five clones that hybridized to the probe
were isolated and the plasmid DNA from these lambda clones was rescued and characterized by
restriction enzyme analysis. The restriction enzyme analysis suggested that all five clones were
identical but it was not clear that a complete CPR gene was present.

PCR analysis was used to determine if a complete CPR gene was present in any
of the five clones. Degenerate primers were prepared for highly conserved regions of known
CPR genes (See Sutter et al., J. Biol. Chem. 265:16428-16436 (1990), incorporated herein by
reference) (Figure 4). Two primers were synthesized for the FMN binding region (FMN1,
SEQ ID NO: 16 and FMN2, SEQ ID NO: 17). One primer was synthesized for the FAD
binding region (FAD, SEQ ID NO: 18), and one primer for the NADPH binding region
(NADPH, SEQ ID NO: 19) (Table 4). These four primers were used in PCR amplification
experiments using as a template plasmid DNA isolated from four of the five clones described
above. The FMN (SEQ ID NOS: 16 and 17) and FAD (SEQ ID NO: 18) primers served as
forward primers and the NADPH primer (SEQ ID NO: 19) as the reverse primer in the PCR
reactions. When different combinations of forward and reverse primers were used, no PCR
products were obtained from any of the plasmids. However, all primer combinations amplified
expected size products with a plasmid containing the C. tropicalis 750 CPR gene (positive
control). The most likely reason for the failure of the primer pairs to amplify a product was
that all four of the clones contained a truncated CPR gene. One of the four clones (pHKM1) was
sequenced using the Triplex 5' (SEQ ID NO: 30) and the Triplex 3' (SEQ ID NO: 31) primers
(Table 4) which flank the insert and the multiple cloning site on the cloning vector, and with
the degenerate primer based upon the NADPH binding site described above. The NADPH
primer (SEQ ID NO: 19) failed to yield any sequence data and this is consistent with the PCR
analysis. Sequences obtained with Triplex primers were compared with the C. tropicalis 750 CPR
sequence using the MacVector™ program (Oxford Molecular Group, Campbell, CA). Sequence
obtained with the Triplex 3' primer (SEQ ID NO: 31) showed similarity to an internal sequence
of the C. tropicalis 750 CPR gene confirming that pHKM1 contained a truncated version of a
20336 CPR gene. pHKM1 had a 3.8 kb insert which included a 1.2 kb coding region of the
CPR gene accompanied by 2.5 kb of upstream DNA (Figure 5). Approximately 0.85 kb of the
20336 CPR gene encoding the C-terminal portion of the CPR protein is missing from this clone.

Since the first Clontech library yielded only a truncated CPR gene, the second 
library prepared by Clontech was screened to isolate a full-length CPR gene. Three putative 
CPR clones were obtained. The three clones, having inserts in the range of 5-7 kb, were 
designated pHKM2, pHKM3 and pHKM4. All three were characterized by PCR using the 
degenerate primers described above. Both pHKM2 and pHKM4 gave PCR products with two 
sets of internal primers. pHKM3 gave a PCR product only with the FAD (SEQ ID NO: 18) and
NADPH (SEQ ID NO: 19) primers suggesting that this clone likely contained a truncated CPR 
gene. All three plasmids were partially sequenced using the two Triplex primers and a third 
primer whose sequence was selected from the DNA sequence near the truncated end of the CPR 
gene present in pHKM1. This analysis confirmed that both pHKM2 and pHKM4 have sequences that
overlap pHKM1 and that both contained the 3' region of the CPR gene that is missing from
pHKM1. Portions of inserts from pHKM1 and pHKM4 were sequenced and a full-length CPR
gene was identified. Based on the DNA sequence and PCR analysis, it was concluded that
pHKM1 contained the putative promoter region and 1.2 kb of sequence encoding a portion (5'
end) of a CPR gene. pHKM4 had 1.1 kb of DNA that overlapped pHKM1 and contained the
remainder (3' end) of a CPR gene along with a downstream untranslated region (Figure 6).
Together these two plasmids contained a complete CPRA gene with an upstream promoter
region. CPRA is 4206 nucleotides in length (SEQ ID NO: 81) and includes a regulatory region 
and a protein coding region (defined by nucleotides 1006-3042) which is 2037 base pairs in 
length and codes for a putative protein of 679 amino acids (SEQ ID NO: 83) (Figures 13 and 
14). In Figure 13, the asterisks denote conserved nucleotides between CPRA and CPRB, bold 
denotes protein coding nucleotides, and the start and stop codons are underlined. The CPRA 
protein, when analyzed by the protein alignment program of the GeneWorks™ software package
(Oxford Molecular Group, Campbell, CA), showed extensive homology to CPR proteins from
C. tropicalis 750 and C. maltosa.

2) Cloning of the CPRB Allele 

To clone the second allele, CPRB, the third genomic library, prepared by
Henkel, was screened using DNA fragments from pHKM1 and pHKM4 as probes. Five clones
were obtained and these were sequenced with the three internal primers used to sequence CPRA,
designated PRK1.F3 (SEQ ID NO: 20), PRK1.F5 (SEQ ID NO: 21) and
PRK4.R20 (SEQ ID NO: 22) (Table 4), and with the two outside primers (M13 -20 and T3
[Stratagene]) for the polylinker region present in the pBK-CMV cloning vector. Sequence
analysis suggested that four of these clones, designated pHKM5 to 8, contained inserts which
were identical to the CPRA allele isolated earlier. All four seemed to contain a full-length CPR
gene. The fifth clone was very similar to the CPRA allele, especially in the open reading frame
region where the identity was very high. However, there were significant differences in the 5'
and 3' untranslated regions. This suggested that the fifth clone was the allele to CPRA. The
plasmid was designated pHKM9 (Figure 7) and a 4T 4 kb region of this plasmid was sequenced 
and the analysis of this sequence confirmed the presence of the CPRB allele (SEQ ID NO: 82), 
which includes a regulatory region and a protein coding region (defined by nucleotides 1033- 
3069) (Figure 13). The amino acid sequence of the CPRB protein is set forth in SEQ ID NO: 
84 (Figure 14). 
B. Cloning of C. tropicalis 20336 (CYP) Genes

1) Cloning of CYP52A2A, CYP52A3A & 3B and CYP52A5A & 5B

Clones carrying CYP52A2A, A3A, A3B, A5A and A5B genes were
isolated from the first and second Clontech genomic libraries using an oligonucleotide probe
(HemeB1, SEQ ID NO: 27) whose sequence was based upon the amino acid sequence for the
highly conserved heme binding region present throughout the CYP52 family. The first and
second libraries were converted to the plasmid form and screened by colony hybridizations
using the HemeB1 probe (SEQ ID NO: 27) (Table 4). Several potential clones were isolated
and the plasmid DNA was isolated from these clones and sequenced using the HemeB1
oligonucleotide (SEQ ID NO: 27) as a primer. This approach succeeded in identifying five
CYP52 genes. Three of the CYP genes appeared unique, while the remaining two were
classified as alleles. Based upon an arbitrary choice of homology to CYP52 genes from Candida
maltosa, these five genes and corresponding plasmids were designated CYP52A2A (pPA15
[Figure 26]), CYP52A3A (pPA57 [Figure 29]), CYP52A3B (pPA62 [Figure 30]), CYP52A5A
(pPAL3 [Figure 31]) and CYP52A5B (pPA5 [Figure 32]). The complete DNA sequence
including regulatory and protein coding regions of these five genes was obtained and confirmed
that all five were CYP52 genes (Figure 15). In Figure 15, the asterisks denote conserved
nucleotides among the CYP genes. Bold indicates the protein coding nucleotides of the CYP
genes, and the start and stop codons are underlined. The CYP52A2A gene as represented by
SEQ ID NO: 86 has a protein coding region defined by nucleotides 1199-2767 and the encoded
protein has an amino acid sequence as set forth in SEQ ID NO: 96. The CYP52A3A gene as
represented by SEQ ID NO: 88 has a protein coding region defined by nucleotides 1126-2748
and the encoded protein has an amino acid sequence as set forth in SEQ ID NO: 98. The
CYP52A3B gene as represented by SEQ ID NO: 89 has a protein coding region defined by nucleotides
913-2535 and the encoded protein has an amino acid sequence as set forth in SEQ ID NO: 99.
The CYP52A5A gene as represented by SEQ ID NO: 90 has a protein coding region defined by
nucleotides 1103-2656 and the encoded protein has an amino acid sequence as set forth in SEQ
ID NO: 100. The CYP52A5B gene as represented by SEQ ID NO: 91 has a protein coding
region defined by nucleotides 1142-2695 and the encoded protein has an amino acid sequence
as set forth in SEQ ID NO: 101.

2) Cloning of CYP52A1A and CYP52A8A 

CYP52A1A and CYP52A8A genes were isolated from the third genomic library
using PCR fragments as probes. The PCR fragment probe for CYP52A1 was generated after
PCR amplification of 20336 genomic DNA with oligonucleotide primers that were designed to
amplify a region from the Helix I region to the HR2 region using all available CYP52 genes
from the National Center for Biotechnology Information. Degenerate forward primers UCup1
(SEQ ID NO: 23) and UCup2 (SEQ ID NO: 24) were designed based upon an amino acid
sequence (-RDTTAG-) from the Helix I region (Table 4). Degenerate primers UCdown1 (SEQ
ID NO: 25) and UCdown2 (SEQ ID NO: 26) were designed based upon an amino acid sequence
(-GQQFAL-) from the HR2 region (Table 4). For the reverse primers, the DNA sequence
represents the reverse complement of the corresponding amino acid sequence. These primers
were used in pairwise combinations in a PCR reaction with Stoffel Taq DNA polymerase
(Perkin-Elmer Cetus, Foster City, CA) according to the manufacturer's recommended
procedure. A PCR product of approximately 450 bp was obtained. This product was purified
from agarose gel using Gene-clean™ (Bio 101, La Jolla, CA) and ligated to the pTAG™ vector
(Figure 17) (R&D Systems, Minneapolis, MN) according to the recommendations of the
manufacturer. No treatment was necessary to clone into pTAG because it uses
the TA cloning technique. Plasmids from several transformants were isolated and their inserts
were characterized. One plasmid contained the PCR clone intact. The DNA sequence of the
PCR fragment (designated 44CYP3, SEQ ID NO: 107) shared homology with the DNA
sequences for the CYP52A1 gene of C. maltosa and the CYP52A3 gene of C. tropicalis 750.
This fragment was used as a probe in isolating the C. tropicalis 20336 CYP52A1 homolog.
The third genomic library was screened using the 44CYP3 PCR probe (SEQ ID NO: 107) and a
clone (pHKM11) that contained a full-length CYP52 gene was obtained (Figure 8). The clone
contained a gene having regulatory and protein coding regions. An open reading frame of 1572
nucleotides encoded a CYP52 protein of 523 amino acids (Figures 15 and 16). This CYP52
gene was designated CYP52A1A (SEQ ID NO: 85) since its putative amino acid sequence (SEQ
ID NO: 95) was most similar to the CYP52A1 protein of C. maltosa. The protein coding region
of the CYP52A1A gene is defined by nucleotides 1177-2748 of SEQ ID NO: 85.
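
The degenerate primer design described above can be illustrated with a minimal sketch. It counts how many distinct nucleotide sequences could encode each of the two peptide motifs named in the text, and forms the reverse complement that a reverse primer would carry. The function names and the codon-count table are illustrative assumptions and are not taken from Table 4; the standard genetic code is assumed, and the actual primers may use a reduced set of codons.

```python
# Number of codons per amino acid in the standard genetic code.
# Assumption: the standard table is used here for illustration only;
# Candida species are reported to decode CUG as serine, which would
# change the counts for L and S.
CODONS_PER_AA = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}

# Complement table covering the IUPAC ambiguity codes used in degenerate primers.
_IUPAC_COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def degeneracy(peptide):
    """Return the number of nucleotide sequences that can encode the peptide."""
    total = 1
    for amino_acid in peptide.upper():
        total *= CODONS_PER_AA[amino_acid]
    return total


def reverse_complement(seq):
    """Return the reverse complement of a (possibly degenerate) DNA sequence."""
    return seq.upper().translate(_IUPAC_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(degeneracy("RDTTAG"))   # Helix I motif used for the forward primers
    print(degeneracy("GQQFAL"))   # HR2 motif used for the reverse primers
    print(reverse_complement("ATGCRY"))
```

Under these assumptions the two motifs correspond to 3072 and 768 possible coding sequences, respectively, which is one reason such primers are usually synthesized as mixtures or split into several pools.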

A similar approach was taken to clone CYP52A8A. A PCR fragment probe for
CYP52A8 was generated using primers for highly conserved sequences of CYP52A3, CYP52A2
and CYP52A5 genes of C. tropicalis 750. The reverse primer (primer 2,3,5,M) (SEQ ID NO:
29) was designed based on the highly conserved heme binding region (Table 4). The design of
the forward primer (primer 2,3,5,P) (SEQ ID NO: 28) was based upon a sequence conserved
near the N-terminus of the CYP52A3, CYP52A2 and CYP52A5 genes from C. tropicalis 750
(Table 4). Amplification of 20336 genomic DNA with these two primers gave a mixed PCR
product. One amplified PCR fragment was 1006 bp long (designated DCA1002). The DNA
sequence for this fragment was determined and was found to have 85% identity to the DNA
sequence for the CYP52D4 gene of C. tropicalis 750. When this PCR product was used to
screen the third genomic library, one clone (pHKM12) was identified that contained a full-length
CYP52 gene along with 5' and 3' flanking sequences (Figure 9). The CYP52 gene
included regulatory and protein coding regions with an open reading frame of 1539 nucleotides
which encoded a putative CYP52 protein of 512 amino acids (Figures 15 and 16). This
gene was designated as CYP52A8A (SEQ ID NO: 92) since its amino acid sequence (SEQ ID
NO: 102) was most similar to the CYP52A8 protein of C. maltosa. The protein coding region of
the CYP52A8A gene is defined by nucleotides 464-2002 of SEQ ID NO: 92. The amino acid
sequence of the CYP52A8A protein is set forth in SEQ ID NO: 102.
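
The coordinates, open reading frame lengths and protein lengths quoted above for CYP52A1A and CYP52A8A can be cross-checked with a short sketch. It assumes that the quoted nucleotide range is 1-based and inclusive and that it includes the stop codon; the function name is hypothetical.

```python
def orf_summary(start, end):
    """Return (ORF length in nucleotides, encoded amino acids).

    Assumes 1-based inclusive coordinates and that the range ends with a
    stop codon, which does not contribute an amino acid.
    """
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("coding region length is not a multiple of three")
    return length, length // 3 - 1


if __name__ == "__main__":
    # Coordinates as quoted in the text for SEQ ID NO: 85 and SEQ ID NO: 92.
    print("CYP52A1A", orf_summary(1177, 2748))
    print("CYP52A8A", orf_summary(464, 2002))
```

With these assumptions the sketch reproduces the figures stated in the text: 1572 nucleotides and 523 amino acids for CYP52A1A, and 1539 nucleotides and 512 amino acids for CYP52A8A.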

3) Cloning of CYP52D4A 

The screening of the second genomic library with the HemeB1 (SEQ ID NO: 27)
primer (Table 4) yielded a clone carrying a plasmid (pPA18) that contained a truncated gene
having homology with the CYP52D4 gene of C. maltosa (Figure 33). A 1.3 to 1.5-kb EcoRI-SstI
fragment from pPA18 containing part of the truncated CYP gene was isolated and used as a
probe to screen the third genomic library for a full-length CYP52 gene. One clone (pHKM13)
was isolated and found to contain a full-length CYP gene with extensive 5' and 3' flanking
sequences (Figure 10). This gene has been designated as CYP52D4A (SEQ ID NO: 94) and the
complete DNA including regulatory and protein coding regions (coding region defined by
nucleotides 767-2266) and putative amino acid sequence (SEQ ID NO: 104) of this gene is
shown in Figures 15 and 16. CYP52D4A (SEQ ID NO: 94) shares the greatest homology with
the CYP52D4 gene of C. maltosa.

4) Cloning of CYP52A2B and CYP52A8B 

A mixed probe containing CYP52A1A, A2A, A3A, D4A, A5A and A8A genes was
used to screen the third genomic library and several putative positive clones were identified.
Seven of these were sequenced with the degenerate primers Cyp52a (SEQ ID NO: 32), Cyp52b
(SEQ ID NO: 33), Cyp52c (SEQ ID NO: 34) and Cyp52d (SEQ ID NO: 35) shown in Table 4.
These primers were designed from highly conserved regions of the four CYP52 subfamilies,
namely CYP52A, B, C and D. Sequences from two clones, pHKM14 and pHKM15 (Figures 11
and 12), shared considerable homology with DNA sequence of the C. tropicalis 20336
CYP52A2 and CYP52A8 genes, respectively. The complete DNA (SEQ ID NO: 87) including
regulatory and protein coding regions (coding region defined by nucleotides 1072-2640) and
putative amino acid sequence (SEQ ID NO: 97) of the CYP52 gene present in pHKM14
suggested that it is CYP52A2B (Figures 15 and 16). The complete DNA (SEQ ID NO: 93)
including regulatory and protein coding regions (coding region defined by nucleotides 1017-2555)
and putative amino acid sequence (SEQ ID NO: 103) of the CYP52 gene present in
pHKM15 suggested that it is CYP52A8B (Figures 15 and 16).

EXAMPLE 14 

Identification of CYP and CPR Genes Induced by
Selected Fatty Acid and Alkane Substrates 

Genes whose transcription is turned on by the presence of selected fatty
acid or alkane substrates have been identified using the QC-RT-PCR assay. This assay was used
to measure (CYP) and (CPR) gene expression in fermentor grown cultures of C. tropicalis ATCC
20962. This method involves the isolation of total cellular RNA from cultures of C. tropicalis and
the quantification of a specific mRNA within that sample through the design and use of sequence
specific QC-RT-PCR primers and an RNA competitor. Quantification is achieved through the
use of known concentrations of highly homologous competitor RNA in the QC-RT-PCR
reactions. The resulting QC-RT-PCR amplified cDNAs are separated and quantitated through
the use of ion pairing reverse phase HPLC. This assay was used to characterize the expression
of CYP52 genes of C. tropicalis ATCC 20962 in response to various fatty acid and alkane
substrates. Genes which were induced were identified by the calculation of their mRNA
concentration at various times before and after induction. Figure 18 provides an example of how
the concentration of mRNA for CYP52A5 can be calculated using the QC-RT-PCR assay. The
log ratio of unknown (U) to competitor product (C) is plotted versus the concentration of
competitor RNA present in the QC-RT-PCR reactions. The concentration of competitor which
results in a log ratio of U/C of zero represents the point where the unknown messenger RNA
concentration is equal to the concentration of the competitor. Figure 18 allows for the
calculation of the amount of CYP52A5 message present in 100 ng of total RNA isolated from
cell samples taken at 0, 1, and 2 hours after the addition of Emersol® 267 in a fermentor run.
From this analysis, it is possible to determine the concentration of the CYP52A5 mRNA present
in 100 ng of total cellular RNA. In the plot contained in Figure 18 it takes 0.46 pg of competitor
to equal the number of mRNAs of CYP52A5 in 100 ng of RNA isolated from cells just prior
(time 0) to the addition of the substrate, Emersol® 267. In cell samples taken at one and two
hours after the addition of Emersol® 267 it takes 5.5 and 8.5 pg of competitor RNA, respectively.
This result demonstrates that CYP52A5 (SEQ ID NOS: 90 and 91) is induced more than 18 fold
within two hours after the addition of Emersol® 267. This type of analysis was used to
demonstrate that CYP52A5 (SEQ ID NOS: 90 and 91) is induced by Emersol® 267. Figure 19
shows the relative amounts of CYP52A5 (SEQ ID NOS: 90 and 91) expression in fermentor
runs with and without Emersol® 267 as a substrate. The differences in the CYP52A5 (SEQ ID
NOS: 90 and 91) expression patterns are due to the addition of Emersol® 267 to the
fermentation medium.
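
The calculation described for Figure 18 can be sketched as follows. The sketch fits a straight line to log10(U/C) against log10 of the competitor amount and solves for the amount at which the log ratio is zero; it then computes fold induction from the equivalence points quoted in the text (0.46, 5.5 and 8.5 pg). Plotting against the logarithm of the competitor amount is an assumption made here for the fit, since the text states only that the log ratio is plotted versus competitor concentration. The titration values in the usage section are synthetic and are not data from the figure, and the function names are hypothetical.

```python
import math


def equivalence_point(competitor_pg, ratio_u_over_c):
    """Estimate the competitor amount (pg) at which log10(U/C) equals zero.

    Fits log10(U/C) = slope * log10(competitor) + intercept by ordinary
    least squares and returns the x-intercept on the original scale.
    """
    xs = [math.log10(c) for c in competitor_pg]
    ys = [math.log10(r) for r in ratio_u_over_c]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return 10 ** (-intercept / slope)


def fold_induction(baseline_pg, induced_pg):
    """Ratio of the induced equivalence point to the time-zero equivalence point."""
    return induced_pg / baseline_pg


if __name__ == "__main__":
    # Synthetic titration (illustrative only): an unknown equivalent to 0.46 pg.
    competitor = [0.1, 0.5, 1.0, 5.0]
    ratios = [0.46 / c for c in competitor]
    print(round(equivalence_point(competitor, ratios), 2))

    # Equivalence points quoted in the text for 0, 1 and 2 hours.
    print(round(fold_induction(0.46, 5.5), 1))
    print(round(fold_induction(0.46, 8.5), 1))
```

Using the quoted values, 8.5 pg divided by 0.46 pg is approximately 18.5, which agrees with the statement that CYP52A5 is induced more than 18 fold within two hours.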

This analysis clearly demonstrates that expression of CYP52A5 (SEQ ID NOS: 90
and 91) in C. tropicalis 20962 is inducible by the addition of Emersol® 267 to the growth
medium. This analysis was performed to characterize the expression of CYP52A2A (SEQ ID
NO: 86), CYP52A3A and CYP52A3B (SEQ ID NOS: 88 and 89), CYP52A8A (SEQ ID NO: 92), CYP52A1A
(SEQ ID NO: 85), CYP52D4A (SEQ ID NO: 94) and CPRB (SEQ ID NO: 82) in response to
the presence of Emersol® 267 in the fermentation medium (Figure 20). The results of these
analyses indicate that, like the CYP52A5 gene (SEQ ID NOS: 90 and 91) of C. tropicalis 20962,
the CYP52A2A gene (SEQ ID NO: 86) is inducible by Emersol® 267. A small induction is
observed for CYP52A1A (SEQ ID NO: 85) and CYP52A8A (SEQ ID NO: 92). In contrast, any
induction for CYP52D4A (SEQ ID NO: 94), CYP52A3A (SEQ ID NO: 88) and CYP52A3B (SEQ
ID NO: 89) is below the level of detection of the assay. CPRB (SEQ ID NO: 82) is moderately
induced by Emersol® 267, four to five fold. The results of these analyses are summarized in
Figure 20. Figure 34 provides an example of selective induction of CYP52A genes. When pure
fatty acids or alkanes were spiked into a fermentor containing C. tropicalis 20962 or a derivative
thereof, the transcriptional activation of CYP52A genes was detected using the QC-RT-PCR
assay. Figure 34 shows that pure oleic acid (C18:1) strongly induces CYP52A2A (SEQ ID NO:
86) while inducing CYP52A5 (SEQ ID NOS: 90 and 91). In the same fermentor, addition of
pure alkane (tridecane) shows strong induction of both CYP52A2A (SEQ ID NO: 86) and
CYP52A1A (SEQ ID NO: 85). However, tridecane did not induce CYP52A5 (SEQ ID NOS: 90
and 91). In a separate fermentation using ATCC 20962, containing pure octadecane as the
substrate, induction of CYP52A2A, CYP52A5A and CYP52A1A is detected (see Figure 36). The
foregoing demonstrates selective induction of particular CYP genes by specific substrates, thus
providing techniques for selective metabolic engineering of cell strains. For example, if tridecane
modification is desired, organisms engineered for high levels of CYP52A2A (SEQ ID NO: 86)
and CYP52A1A (SEQ ID NO: 85) activity are indicated. If oleic acid modification is desired,
organisms engineered for high levels of CYP52A2A (SEQ ID NO: 86) activity are indicated.
EXAMPLE 15

Integration of Selected CYP and CPR Genes 
into the Genome of Candida tropicalis 

In order to integrate selected genes into the chromosome of C. tropicalis 20336
or its descendants, there has to be a target DNA sequence, which may or may not be an intact
gene, into which the genes can be inserted. There must also be a method to select for the
integration event. In some cases the target DNA sequence and the selectable marker are the
same and, if so, then there must also be a method to regain use of the target gene as a selectable
marker following the integration event. In C. tropicalis and its descendants, one gene which fits
these criteria is URA3A, encoding orotidine-5'-phosphate decarboxylase. Using it as a target for
integration, ura- variants of C. tropicalis can be transformed in such a way as to regenerate a
URA+ genotype via homologous recombination (Figure 21). Depending upon the design of the
integration vector, one or more genes can be integrated into the genome at the same time. Using
a split URA3A gene oriented as shown in Figure 22, homologous integration would yield at least
one copy of the gene(s) of interest which are inserted between the split portions of the URA3A
gene. Moreover, because of the high sequence similarity between URA3A and URA3B genes,
integration of the construct can occur at both the URA3A and URA3B loci. Subsequently, an
oligonucleotide designed with a deletion in a portion of the URA gene, based on the identical
sequence across both the URA3A and URA3B genes, can be utilized to yield C. tropicalis
transformants which are once again ura- but which still carry one or more newly integrated
genes of choice (Figure 21). ura- variants of C. tropicalis can also be isolated via other
methods such as classical mutagenesis or by spontaneous mutation. Using well established
protocols, selection of ura- strains can be facilitated by the use of 5-fluoroorotic acid (5-FOA)
as described, e.g., in Boeke et al., Mol. Gen. Genet. 197:345-346 (1984), incorporated herein by
reference. The utility of this approach for the manipulation of C. tropicalis has been well
documented as described, e.g., in Picataggio et al., Mol. and Cell. Biol. 11:4333-4339 (1991);
Rohrer et al., Appl. Microbiol. Biotechnol. 36:650-654 (1992); Picataggio et al., Bio/Technology
10:894-898 (1992); U.S. Patent No. 5,648,247; U.S. Patent No. 5,620,878; U.S. Patent No.
5,204,252; U.S. Patent No. 5,254,466, all of which are incorporated herein by reference.
A. Construction of a URA Integration Vector, pURAin.

Primers were designed and synthesized based on the 1712 bp sequence of the
URA3A gene of C. tropicalis 20336 (see Figure 23). The nucleotide sequence of the URA3A
gene of C. tropicalis 20336 is set forth in SEQ ID NO: 105 and the amino acid sequence of the
encoded protein is set forth in SEQ ID NO: 106. URA3A Primer Set #1a (SEQ ID NO: 9) and
#1b (SEQ ID NO: 10) (Table 4) was used in PCR with C. tropicalis 20336 genomic DNA to
amplify URA3A sequences between nucleotide 733 and 1688 as shown in Figure 23. The
primers are designed to introduce unique 5' AscI and 3' PacI restriction sites into the resulting
amplified URA3A fragment. AscI and PacI sites were chosen because these sites are not present
within CYP or CPR genes identified to date. URA3A Primer Set #2 was used in PCR with C.
tropicalis 20336 genomic DNA as a template, to amplify URA3A sequences between nucleotide
9 and 758 as shown in Figure 23. URA3A Primer Set #2a (SEQ ID NO: 11) and #2b (SEQ ID
NO: 12) (Table 4) was designed to introduce unique 5' PacI and 3' PmeI restriction sites into the
resulting amplified URA3A fragment. The PmeI site is also not present within CYP and CPR
genes identified to date. PCR fragments of the URA3A gene were purified, restricted with AscI,
PacI and PmeI restriction enzymes and ligated to a gel purified, QiaexII cleaned AscI-PmeI
digest of plasmid pNEB193 (Figure 25) purchased from New England Biolabs (Beverly, MA).
The ligation was performed with an equimolar number of DNA termini at 16 °C for 16 hr using
T4 DNA ligase (New England Biolabs). Ligations were transformed into E. coli XL1-Blue cells
(Stratagene, La Jolla, CA) according to the manufacturer's recommendations. White colonies were
isolated and grown, and plasmid DNA was isolated and digested with AscI-PmeI to confirm insertion of the
modified URA3A into pNEB193. The resulting base integration vector was named pURAin
(Figure 24).
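
The rationale for choosing AscI, PacI and PmeI, namely that their recognition sites are absent from the genes to be cloned, can be checked for any candidate sequence with a sketch such as the following. The recognition sequences used are the commonly documented ones for these enzymes; the function names and the short test sequences are hypothetical and do not correspond to any sequence in the Sequence Listing.

```python
# Recognition sequences for the three rare-cutting enzymes named in the text.
# All three are palindromic, so scanning one strand is sufficient.
RECOGNITION_SITES = {
    "AscI": "GGCGCGCC",
    "PacI": "TTAATTAA",
    "PmeI": "GTTTAAAC",
}


def find_sites(seq, sites=RECOGNITION_SITES):
    """Return a dict mapping enzyme name to 1-based start positions of its site."""
    seq = seq.upper()
    hits = {}
    for name, motif in sites.items():
        positions = []
        index = seq.find(motif)
        while index != -1:
            positions.append(index + 1)
            index = seq.find(motif, index + 1)
        hits[name] = positions
    return hits


def lacks_cloning_sites(seq):
    """True if none of the recognition sites occur in the sequence."""
    return all(not positions for positions in find_sites(seq).values())


if __name__ == "__main__":
    # Hypothetical sequences, for illustration only.
    print(find_sites("AAAGGCGCGCCTTT"))
    print(lacks_cloning_sites("ATGACCGATTTGGCT"))
```

A gene for which lacks_cloning_sites returns True could, in principle, be excised from or inserted into the vector with these enzymes without being cut internally.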
B. Amplification of CYP52A2A, CYP52A3A, CYP52A5A and
CPRB from C. tropicalis 20336 Genomic DNA

The genes encoding CYP52A2A (SEQ ID NO: 86) and CYP52A3A (SEQ ID
NO: 88) from C. tropicalis 20336 were amplified from genomic clones (pPA15 and pPA57,
respectively) (Figures 26 and 29) via PCR using primers (Primer CYP 2A#1, SEQ ID NO: 1 and
Primer CYP 2A#2, SEQ ID NO: 2 for CYP52A2A) (Primer CYP 3A#1, SEQ ID NO: 3 and
Primer CYP 3A#2, SEQ ID NO: 4 for CYP52A3A) to introduce PacI cloning sites. These PCR
primers were designed based upon the DNA sequence determined for CYP52A2A (SEQ ID NO:
86) (Figure 15). The AmpliTaq Gold PCR kit (Perkin Elmer Cetus, Foster City, CA) was used
according to the manufacturer's specifications. The CYP52A2A PCR amplification product was 2,230
base pairs in length, yielding 496 bp of DNA upstream of the CYP52A2A start codon and 168
bp downstream of the stop codon for the CYP52A2A ORF. The CYP52A3A PCR amplification
product was 2154 base pairs in length, yielding 437 bp of DNA upstream of the CYP52A3A start
codon and 97 bp downstream of the stop codon for the CYP52A3A ORF.

The gene encoding CYP52A5A (SEQ ID NO: 90) from C. tropicalis 20336 was
amplified from genomic DNA via PCR using primers (Primer CYP 5A#1, SEQ ID NO: 5 and
Primer CYP 5A#2, SEQ ID NO: 6) to introduce PacI cloning sites. These PCR primers were
designed based upon the DNA sequence determined for CYP52A5A (SEQ ID NO: 90). The
Expand Hi-Fi Taq PCR kit (Boehringer Mannheim, Indianapolis, IN) was used according to the
manufacturer's specifications. The CYP52A5A PCR amplification product was 3,298 base pairs
in length.

The gene encoding CPRB (SEQ ID NO: 82) from C. tropicalis 20336 was
amplified from genomic DNA via PCR using primers (CPR B#1, SEQ ID NO: 7 and CPR B#2,
SEQ ID NO: 8) based upon the DNA sequence determined for CPRB (SEQ ID NO: 82) (Figure
13). These primers were designed to introduce unique PacI cloning sites. The Expand Hi-Fi
Taq PCR kit (Boehringer Mannheim, Indianapolis, IN) was used according to the manufacturer's
specifications. The CPRB PCR product was 3266 bp in length, yielding 747 bp of DNA
upstream of the CPRB start codon and 493 bp downstream of the stop codon for the CPRB
ORF. The resulting PCR products were isolated via agarose gel electrophoresis, purified using
QiaexII and digested with PacI. The PCR fragments were purified, desalted and concentrated
using a Microcon 100 (Amicon, Beverly, MA).

The above described amplification procedures are applicable to the other genes 
listed in Table 5 using the respectively indicated primers. 

C. Cloning of CYP and CPR Genes into pURAin.

The next step was to clone the selected CYP and CPR genes into the pURAin
integration vector. In a preferred aspect of the present invention, no foreign DNA other than
that specifically provided by synthetic restriction site sequences is incorporated into the DNA
which was cloned into the genome of C. tropicalis, i.e., with the exception of restriction site DNA
only native C. tropicalis DNA sequences are incorporated into the genome. pURAin was
digested with PacI, Qiaex II cleaned, and dephosphorylated with Shrimp Alkaline Phosphatase
(SAP) (United States Biochemical, Cleveland, OH) according to the manufacturer's
recommendations. Approximately 500 ng of PacI linearized pURAin was dephosphorylated for 1
hr at 37 °C using SAP at a concentration of 0.2 Units of enzyme per 1 pmol of DNA termini.
The reaction was stopped by heat inactivation at 65 °C for 20 min.
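
The enzyme dose stated above (0.2 Units of SAP per 1 pmol of DNA termini) depends on how many picomoles of ends are present in the digested vector. A minimal sketch of that conversion follows. It assumes an average mass of about 650 g/mol per base pair of double-stranded DNA, which is a common approximation, and two termini per linearized molecule. The vector length of 4400 bp used in the example is a hypothetical placeholder, since the size of pURAin is not stated in this section.

```python
AVG_MASS_PER_BP = 650.0  # g/mol per base pair of dsDNA (approximation, assumed)


def pmol_termini(mass_ng, length_bp, ends_per_molecule=2):
    """Picomoles of DNA termini in mass_ng of linear dsDNA of length_bp."""
    # ng -> g is 1e-9 and mol -> pmol is 1e12, so the net factor is 1e3.
    pmol_molecules = mass_ng * 1e3 / (length_bp * AVG_MASS_PER_BP)
    return pmol_molecules * ends_per_molecule


def sap_units(mass_ng, length_bp, units_per_pmol=0.2):
    """Units of phosphatase at the stated dose per pmol of termini."""
    return units_per_pmol * pmol_termini(mass_ng, length_bp)


if __name__ == "__main__":
    # 500 ng as stated in the text; 4400 bp is a hypothetical vector length.
    print(round(pmol_termini(500, 4400), 3))
    print(round(sap_units(500, 4400), 3))
```

Because the result scales inversely with vector length, the actual vector size should be substituted before the figure is used to plan a reaction.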

The CYP52A2A PacI fragment derived using the primers shown in Table 4 was
ligated to plasmid pURAin which had also been digested with PacI. PacI digested pURAin was
dephosphorylated, and ligated to the CYP52A2A ULTMA PCR product as described previously.
The ligation mixture was transformed into E. coli XL1 Blue MRF' (Stratagene) and 2 resistant
colonies were selected and screened for correct constructs which should contain vector sequence,
the inverted URA3A gene, and the amplified CYP52A2A gene (SEQ ID NO: 86) of 20336.
AscI-PmeI digestion identified one of the two constructs, plasmid pURA2in, as being correct
(Figure 27). This plasmid was sequenced and compared to CYP52A2A (SEQ ID NO: 86) to
confirm that PCR did not introduce DNA base changes that would result in an amino acid
change.

Prior to its use, the CPRB PacI fragment derived using the primers shown in
Table 4 was sequenced and compared to CPRB (SEQ ID NO: 82) to confirm that PCR did not
introduce DNA base pair changes that would result in an amino acid change. Following
confirmation, CPRB (SEQ ID NO: 82) was ligated to plasmid pURAin which had also been
digested with PacI. PacI digested pURAin was dephosphorylated, and ligated to the CPR Expand
Hi-Fi PCR product as described previously. The ligation mixture was transformed into E. coli
XL1 Blue MRF' (Stratagene) and several resistant colonies were selected and screened for correct
constructs which should contain vector sequence, the inverted URA3A gene, and the amplified
CPRB gene (SEQ ID NO: 82) of 20336. AscI-PmeI digestion confirmed a successful construct,
pURAREDBin.

In a manner similar to the above, each of the other CYP and CPR genes disclosed
herein is cloned into pURAin. PacI fragments of these genes, whose sequences are given in
Figures 13 and 15, are derivable by methods known to those skilled in the art.

1) Construction of Vectors Used to Generate HDC 20 and HDC 23 

A previously constructed integration vector containing CPRB (SEQ ID NO: 82), 
pURAREDBin, was chosen as the starting vector. This vector was partially digested with PacI
and the linearized fragment was gel-isolated. The active PacI site was destroyed by treatment with
T4 DNA polymerase and the vector was re-ligated. Subsequent isolation and complete
digestion of this new plasmid yielded a vector now containing only one active PacI site. This
fragment was gel-isolated, dephosphorylated and ligated to the CYP52A2A PacI fragment.
Vectors that contain the CYP52A2A (SEQ ID NO: 86) and CPRB (SEQ ID NO: 82) genes 
oriented in the same direction, pURAin CPR 2A S, as well as opposite directions (5' ends 
connected), pURAin CPR 2A O, were generated. 

D. Confirmation of CYP Integration (Figure 21 for Integration Scheme)
into the Genome of C. tropicalis 

Based on the construct, pURA2in, used to transform H5343 ura, a scheme to
detect integration was devised. Genomic DNA from transformants was digested with DraIII and
SpeI, which are enzymes that cut within the URA3A and URA3B genes but not within the
integrated CYP52A2A gene. Digestion of genomic DNA where an integration had occurred at
the URA3A or URA3B loci would be expected to result in a 3.5 kb or a 3.3 kb fragment,
respectively (Figure 28). Moreover, digestion of the same genomic DNA with PacI would yield a
2.2 kb fragment characteristic for the integrated CYP52A2A gene (Figure 28). Southern
hybridizations of these digests with fragments of the CYP52A2A gene were used to screen for
these integration events. Intensity of the band signal from the Southern using PacI digestion was
used as a measure of the number of integration events (i.e., the more copies of the CYP52A2A
gene (SEQ ID NO: 86) that are present, the stronger the hybridization signal).
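The detection scheme above amounts to a lookup from digest and fragment size to an interpretation. A minimal sketch follows, assuming a hypothetical size tolerance of 0.05 kb for reading bands off a gel; the function name and the tolerance are illustrative and not part of the disclosure, while the expected sizes are those stated in the text.

```python
from typing import Optional

# Expected fragment sizes (kb) as stated in the text above.
EXPECTED_FRAGMENTS = {
    "SpeI/DraIII": {
        3.5: "integration at the URA3A locus",
        3.3: "integration at the URA3B locus",
    },
    "PacI": {
        2.2: "integrated CYP52A2A fragment",
    },
}


def interpret_band(digest: str, size_kb: float,
                   tolerance_kb: float = 0.05) -> Optional[str]:
    """Map an observed band to its expected meaning, or None if it matches nothing.

    tolerance_kb is an assumed reading error, not a value from the source.
    """
    for expected_kb, meaning in EXPECTED_FRAGMENTS.get(digest, {}).items():
        if abs(size_kb - expected_kb) <= tolerance_kb:
            return meaning
    return None


print(interpret_band("SpeI/DraIII", 3.52))
print(interpret_band("PacI", 2.2))
print(interpret_band("SpeI/DraIII", 4.0))
```

The tolerance is kept well below the 0.2 kb difference between the URA3A and URA3B fragments so that the two loci cannot be confused.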

C. tropicalis H5343 transformed URA prototrophs were grown at 30 °C, 170 rpm,
in 10 ml SC-uracil media for preparation of genomic DNA. Genomic DNA was isolated by the
method described previously. Genomic DNA was digested with SpeI and DraIII. A 0.95%
agarose gel was used to prepare a Southern hybridization blot. The DNA from the gel was
transferred to a MagnaCharge nylon filter membrane (MSI Technologies, Westboro, MA)
according to the alkaline transfer method of Sambrook et al., supra. For the Southern
hybridization, a 2.2 kb CYP52A2A DNA fragment was used as a hybridization probe. 300 ng of
CYP52A2A DNA was labeled using an ECL Direct labeling and detection system (Amersham) and
the Southern was processed according to the ECL kit specifications. The blot was processed in a
volume of 30 ml of hybridization fluid, corresponding to 0.125 ml/cm². Following a
prehybridization at 42 °C for 1 hr, 300 ng of CYP52A2A probe was added and the hybridization
continued for 16 hr at 42 °C. Following hybridization, the blots were washed two times for 20 min
each at 42 °C in primary wash containing urea. Two 5 min secondary washes at RT were
conducted, followed by detection according to directions. The blots were exposed for 16 hr
as recommended.
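The hybridization figures quoted above imply a membrane area and a probe concentration that are not stated explicitly. The short calculation below derives them from the stated values only; the function names are illustrative.

```python
HYB_VOLUME_ML = 30.0      # hybridization fluid volume from the text
VOLUME_PER_AREA = 0.125   # ml of fluid per cm^2 of membrane, from the text
PROBE_NG = 300.0          # labeled probe added, from the text


def membrane_area_cm2(volume_ml: float, ml_per_cm2: float) -> float:
    """Membrane area covered by a given fluid volume at a given loading."""
    return volume_ml / ml_per_cm2


def probe_concentration_ng_per_ml(probe_ng: float, volume_ml: float) -> float:
    """Probe concentration in the hybridization fluid."""
    return probe_ng / volume_ml


area = membrane_area_cm2(HYB_VOLUME_ML, VOLUME_PER_AREA)
conc = probe_concentration_ng_per_ml(PROBE_NG, HYB_VOLUME_ML)
print(f"membrane area: {area} cm^2, probe: {conc} ng/ml")
```

With the stated values this gives a 240 cm² membrane and 10 ng of probe per ml of hybridization fluid.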

Integration was confirmed by the detection of a SpeI-DraIII 3.5 kb fragment from
the genomic DNA of the transformants but not with the C. tropicalis 20336 control.
Subsequently, a PacI digestion of the genomic DNA of the positive transformants, followed by a
Southern hybridization using a CYP52A2A gene probe, confirmed integration by the detection
of a 2.2 kb fragment. The resulting CYP52A2A integrated strain was named HDC1 (see Table
1).

In a manner similar to the above, each of the genes contained in the PacI
fragments described in Section 3c above was confirmed for integration into the
genome of C. tropicalis.
Transformants generated by transformation with the vectors pURAin CPR 2A S
or pURAin CPR 2A O were analyzed by Southern hybridization for tandem integration of both the
CYP52A2A (SEQ ID NO: 86) and CPRB (SEQ ID NO: 82) genes. Three strains were
generated in which the integrated CYP52A2A (SEQ ID NO: 86) and CPRB (SEQ ID NO: 82) genes
are in the opposite orientation (HDC 20-1, HDC 20-2 and HDC 20-3) and three were
generated with the CYP52A2A (SEQ ID NO: 86) and CPRB (SEQ ID NO: 82) genes integrated
in the same orientation (HDC 23-1, HDC 23-2 and HDC 23-3) (Table 1).

E. Confirmation of CPRB Integration into H5343 ura

Seven transformants were screened by colony PCR using CPRB primer #2 (SEQ
ID NO: 8) and a URA3A-specific primer. In five of the transformants, successful integration was
detected by the presence of a 3899 bp PCR product. This 3899 bp PCR product represents the
CPRB gene adjacent to the URA3A gene in the genome of H5343, thereby confirming
integration. The resulting CPRB integrated strains were named HDC10-1 and HDC10-2 (see
Table 1).

F. Strain Evaluation

As determined by quantitative PCR, when compared to parent H5343, HDC10-1
contained three additional copies of the reductase gene and HDC10-2 contained four additional
copies of the reductase gene. Evaluations of HDC20-1, HDC20-2 and HDC20-3 based on
Southern hybridization data indicate that HDC20-1 contained multiple integrations, i.e., 2 to 3
times that of HDC20-2 or HDC20-3. Evaluations of HDC23-1, HDC23-2 and HDC23-3 based
on Southern hybridization data indicate that HDC23-3 contained multiple integrations, i.e., 2 to
3 times that of HDC23-1 or HDC23-2. The data in Table 8 indicate that the integration of
components of the ω-hydroxylase complex has a positive effect on the improvement of
Candida tropicalis ATCC 20962 as a biocatalyst. The results indicate that CYP52A5A (SEQ
ID NO: 90) is an important gene for the conversion of oleic acid to diacid. Surprisingly, tandem
integrations of CYP and CPR genes oriented in the opposite direction (HDC 20 strains) seem to
be less productive than tandem integrations oriented in the same direction (HDC 23 strains)
(Tables 1 and 8).
Table 9

Media Composition

LB Broth
Bacto Tryptone                                   10 g
Bacto Yeast Extract                              5 g
Sodium Chloride                                  10 g
Distilled Water                                  1,000 ml

LB Agar
Bacto Tryptone                                   10 g
Bacto Yeast Extract                              5 g
Sodium Chloride                                  10 g
Agar                                             15 g
Distilled Water                                  1,000 ml

LB Top Agarose
Bacto Tryptone                                   10 g
Bacto Yeast Extract                              5 g
Sodium Chloride                                  10 g
Agarose                                          7 g
Distilled Water                                  1,000 ml

NZCYM Broth
Bacto Casein Digest                              10 g
Bacto Casamino Acids                             1 g
Bacto Yeast Extract                              5 g
Sodium Chloride                                  5 g
Magnesium Sulfate (anhydrous)                    0.98 g
Distilled Water                                  1,000 ml

NZCYM Agar
Bacto Casein Digest                              10 g
Bacto Casamino Acids                             1 g
Bacto Yeast Extract                              5 g
Sodium Chloride                                  5 g
Magnesium Sulfate (anhydrous)                    0.98 g
Agar                                             15 g
Distilled Water                                  1,000 ml

NZCYM Top Agarose
Bacto Casein Digest                              10 g
Bacto Casamino Acids                             1 g
Bacto Yeast Extract                              5 g
Sodium Chloride                                  5 g
Magnesium Sulfate (anhydrous)                    0.98 g
Agarose                                          7 g
Distilled Water                                  1,000 ml

YEPD Broth
Bacto Yeast Extract                              10 g
Bacto Peptone                                    20 g
Glucose                                          20 g
Distilled Water                                  1,000 ml

YEPD Agar*
Bacto Yeast Extract                              10 g
Bacto Peptone                                    20 g
Glucose                                          20 g
Agar                                             20 g
Distilled Water                                  1,000 ml

SC - uracil*
Bacto-yeast nitrogen base without amino acids    6.7 g
Glucose                                          20 g
Bacto-agar                                       20 g
Drop-out mix                                     2 g
Distilled water                                  1,000 ml

DCA2 medium                                      g/l
Peptone                                          3.0
Yeast Extract                                    6.0
Sodium Acetate                                   3.0
Yeast Nitrogen Base (Difco)                      6.7
Glucose (anhydrous)                              50.0
Potassium Phosphate (dibasic, trihydrate)        7.2
Potassium Phosphate (monobasic, anhydrous)       9.3

DCA3 medium                                      g/l
0.3 M Phosphate buffer, pH 7.5, containing
Glycerol                                         50
Yeast Nitrogen Base (Difco)                      6.7



Drop-out mix
Adenine                                          0.5 g
Alanine                                          2 g
Arginine                                         2 g
Asparagine                                       2 g
Aspartic acid
Cysteine                                         2 g
Glutamine                                        2 g
Glutamic acid                                    2 g
Glycine                                          2 g
Histidine                                        2 g
Inositol                                         2 g
Isoleucine                                       2 g
Leucine                                          10 g
Lysine                                           2 g
Methionine                                       2 g
para-Aminobenzoic acid                           0.2 g
Phenylalanine                                    2 g
Proline                                          2 g
Serine                                           2 g
Threonine                                        2 g
Tryptophan                                       2 g
Tyrosine                                         2 g
Valine                                           2 g

*See Kaiser et al., Methods in Yeast Genetics, Cold Spring Harbor Laboratory Press, USA (1994), incorporated herein by
reference.
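Because every recipe in Table 9 is given per 1,000 ml, preparing a different volume is a proportional scaling. A minimal sketch follows, using the YEPD Broth values from the table; the dictionary layout and function name are illustrative assumptions.

```python
from typing import Dict

# YEPD Broth from Table 9, amounts per 1,000 ml.
YEPD_BROTH: Dict[str, float] = {
    "Bacto Yeast Extract (g)": 10.0,
    "Bacto Peptone (g)": 20.0,
    "Glucose (g)": 20.0,
    "Distilled Water (ml)": 1000.0,
}


def scale_recipe(per_litre: Dict[str, float], final_volume_ml: float) -> Dict[str, float]:
    """Scale a per-1,000-ml recipe linearly to another final volume."""
    if final_volume_ml <= 0:
        raise ValueError("final volume must be positive")
    factor = final_volume_ml / 1000.0
    return {name: amount * factor for name, amount in per_litre.items()}


for name, amount in scale_recipe(YEPD_BROTH, 250.0).items():
    print(f"{name}: {amount}")
```

For a 250 ml batch this gives 2.5 g yeast extract, 5 g peptone and 5 g glucose in 250 ml of water.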

It will be understood that various modifications may be made to the
embodiments and/or examples disclosed herein. Thus, the above description should not be 
construed as limiting, but merely as exemplifications of preferred embodiments. Those skilled 
in the art will envision other modifications within the scope and spirit of the claims appended 
hereto. 

SEQUENCE LISTING

TROPICALIS AND METHODS RELATING THERETO

<130> 1010-16 

<140> US 09/302,602 

<160> 118

<170> PatentIn version 3.1

<210> 1 

<211> 32 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 1 

ccttaattaa atgcacgaag cggagataaa ag 32 



<210> 2 

<211> 30 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 2 

ccttaattaa gcataagctt gctcgagtct 30 



<210> 3 

<211> 31 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 3 

ccttaattaa acgcaatggg aacatggagt g 31 



<210> 4

<211> 34 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer

<400> 4 

ccttaattaa tcgcactacg gttattggta tcag 



<210> 5 

<211> 29 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 5 

ccttaattaa tcaaagtacg ttcaggcgg 



<210> 6 

<211> 34 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 6 

ccttaattaa ggcagacaac aacttggcaa agtc 



<210> 7 

<211> 31 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 7 

ccttaattaa gaggtcgttg gttgagtttt c 



<210> 8 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<400> 8 

ccttaattaa ttgataatga cgttgcggg 



<210> 9

<211> 33 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 9 

aggcgcgccg gagtccaaaa agaccaacct ctg 

<210> 10 

<211> 34 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 10 

ccttaattaa tacgtggata ccttcaagca agtg 



<210> 11 

<211> 35 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 11 

ccttaattaa gctcacgagt tttgggattt tcgag 



<210> 12 

<211> 35 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 12 

gggtttaaac cgcagaggtt ggtctttttg gactc 



<210> 13 

<211> 10 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 13 
gggtttaaac 



<210> 14

<211> 9 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 14 
aggcgcgcc 



<210> 15 

<211> 10 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 15 

ccttaattaa 10



<210> 16 

<211> 21 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<220> 

<221> misc_feature 

<222> (3).. (4) 

<223> y=dCTP or dTTP 



<220> 

<221> misc_feature 

<222> (9).. (10) 

<223> w=dATP or dTTP



<220>

<221> misc_feature

<222> (15).. (16)

<223> w=dATP or dTTP



<220>

<221> misc_feature

<222> (18).. (19)

<223> w=dATP or dTTP



<400> 16 

tcycaaacwg gtacwgcwga a 21 



<210> 17

<211> 21 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<220> 

<221> misc_feature 

<222> (12).. (13) 

<223> y=dCTP or dTTP 



<220> 

<221> misc_feature 

<222> (15).. (16) 

<223> w=dATP or dTTP 



<400> 17 

ggtttgggta aytcwactta t 21 

<210> 18 

<211> 18 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer

<400> 18 

cgttattatc atttcttc 18 



<210> 19 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<220> 

<221> misc_feature

<222> (3).. (4)

<223> m=dATP or dCTP



<220>

<221> misc_feature

<222> (9).. (10)

<223> r=dATP or dGTP 



<400> 19 

gcmacaccrg tacctggacc 20 



<210> 20

<211> 18 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 20 

atcccaatcg taatcagc 



<210> 21 

<211> 18 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 21 

acttgtcttc gtttagca 



<210> 22 

<211> 18 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer

<400> 22 

ctacgtctgt ggtgatgc 



<210> 23 

<211> 17 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<220> 

<221> misc_feature 

<222> (3).. (4)

<223> n=dATP or dCTP or dGTP or dTTP



<220>

<221> misc_feature

<222> (6).. (7)

<223> y=dCTP or dTTP



<220>

<221> misc_feature

<222> (9).. (10)

<223> n=dATP or dCTP or dGTP or dTTP



<220>

<221> misc_feature

<222> (12).. (13)

<223> n=dATP or dCTP or dGTP or dTTP



<220>

<221> misc_feature

<222> (15).. (16)

<223> n=dATP or dCTP or dGTP or dTTP



<400> 23

cgngayacna cngcngg 17



<210> 24

<211> 17

<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Primer
<220>

<221> misc_feature

<222> (3).. (4)

<223> r=dATP or dGTP



<220>

<221> misc_feature

<222> (6).. (7)

<223> y=dCTP or dTTP



<220>

<221> misc_feature

<222> (9).. (10)

<223> n=dATP or dCTP or dGTP or dTTP



<220>

<221> misc_feature

<222> (12).. (13)

<223> n=dATP or dCTP or dGTP or dTTP



<220>

<221> misc_feature

<222> (15).. (16)

<223> n=dATP or dCTP or dGTP or dTTP



<400> 24

agrgayacna cngcngg 17



<210> 25

<211> 17

<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Primer
<220>

<221> misc_feature

<222> (3).. (4)

<223> n=dATP or dCTP or dGTP or dTTP



<220>

<221> misc_feature

<222> (6).. (7)

<223> r=dATP or dGTP



<220>

<221> misc_feature

<222> (9).. (10)

<223> y=dCTP or dTTP



<220>

<221> misc_feature

<222> (12).. (13)

<223> y=dCTP or dTTP



<220>

<221> misc_feature

<222> (15).. (16)

<223> n=dATP or dCTP or dGTP or dTTP



<400> 25

agngcraayt gytgncc 17



<210> 26 

<211> 18 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<220> 

<221> misc_feature 

<222> (1).. (2)

<223> y=dCTP or dTTP



<220>

<221> misc_feature

<222> (4).. (5)

<223> n=dATP or dCTP or dGTP or dTTP



<220>

<221> misc_feature

<222> (7).. (8)

<223> r=dATP or dGTP



<220>

<221> misc_feature

<222> (10).. (11)

<223> y=dCTP or dTTP



<220>

<221> misc_feature

<222> (13).. (14)

<223> y=dCTP or dTTP



<220>

<221> misc_feature

<222> (16).. (17)

<223> n=dATP or dCTP or dGTP or dTTP



<400> 26

yaangcraay tgytgncc 18

<210> 27 

<211> 29 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 27 

attcaacggt ggtccaagaa tctgtttgg 29

<210> 28 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 28 

gagctatgtt gagaccacag tttgc 25 

<210> 29 

<211> 26 

<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Primer

<400> 29

cttcagttaa agcaaattgt ttggcc 26



<210> 30 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 30 

ctcgggaagc gcgccattgt gttgg 25 



<210> 31

<211> 29

<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Primer

<400> 31

taatacgact cactataggg cgaattggc 29



<210> 32

<211> 21

<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Primer
<220>

<221> misc_feature

<222> (3).. (4)

<223> r=dATP or dGTP



<220>

<221> misc_feature

<222> (4).. (5)

<223> y=dCTP or dTTP



<220>

<221> misc_feature

<222> (16).. (17)

<223> y=dCTP or dTTP



<400> 32

tgrytcaaac catctytctg g 21



<210> 33

<211> 17 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 33 

ggaccggcgt taaaggg 



<210> 34 

<211> 23 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 34 

catagtcgwa tyatgcttag acc 



<210> 35 

<211> 17 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 35 

ggaccaccat tgaatgg 



<210> 36 
<211> 540 
<212> DNA 

<213> Candida tropicalis 
<400> 36 

atgattgaac aactcctaga atattggtat gtcgttgtgc cagtgttgta catcatcaaa 60
caactccttg catacacaaa gactcgcgtc ttgatgaaaa agttgggtgc tgctccagtc 120
acaaacaagt tgtacgacaa cgctttcggt atcgtcaatg gatggaaggc tctccagttc 180
aagaaagagg gcagggctca agagtacaac gattacaagt ttgaccactc caagaaccca 240
agcgtgggca cctacgtcag tattcttttc ggcaccagga tcgtcgtgac caaagatcca 300
gagaatatca aagctatttt ggcaacccag tttggtgatt tttctttggg caagaggcac 360
actcttttta agcctttgtt aggtgatggg atcttcacat tggacggcga aggctggaag 420
cacagcagag ccatgttgag accacagttt gccagagaac aagttgctca tgtgacgtcg 480
ttggaaccac acttccagtt gttgaagaag catattctta agcacaaggg tgaatacttt 540



<210> 37 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 37 

ccgatgaagt tttcgacgag taccc 25 

<210> 38 

<211> 26 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 38 

aaggctttaa cgtgtccaat ctggtc 26 

<210> 39 

<211> 27 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 39 

attatcgcca catacttcac caaatgg 27 

<210> 40 

<211> 24 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 40

cgagatcgtg gatacgctgg agtg 24 

<210> 41 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 41 

gccactcggt aactttgtca gggac



<210> 42 

<211> 26 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 42 

cattgaactg agtagccaaa acagcc 

<210> 43 

<211> 27 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 43 

cctacgtttg gtatcgctac tccgttg 



<210> 44 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 44 

tttccagcca gcaccgtcca ag 



<210> 45 

<211> 24 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 45 

gcagagccga tctatgttgc gtcc 

<210> 46 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 46 

tcattgaatg cttccaggaa cctcg



<210> 47 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 47 

aagagggcag ggctcaagag 



<210> 48 

<211> 21 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 48 

tccatgtgaa gatcccatca c 



<210> 49 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 49 

cttgaaggcc gtgttgaacg 

<210> 50 

<211> 21 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 50 

caggatttgt ctgagttgcc g 



<210> 51 

<211> 28 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 51 

ccattgcctt gagatacgcc attggtag 28



<210> 52 

<211> 26 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 52 

agccttggtg tcgttctttt caacgg 26 

<210> 53 

<211> 26 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 53 

ttgggtttgt ttgtttcctg tgtccg 26

<210> 54 

<211> 27 

<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Primer 

<400> 54 

cctttgacct tcaatctggc gtagacg 27 

<210> 55 

<211> 26 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 55 

gtttgctgaa tacgctgaag gtgatg 26 

<210> 56 

<211> 27 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 56 

tggagctgaa caactctctc gtctcgg



<210> 57 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 57 

ttcctcaaca cggacagcgg 



<210> 58 

<211> 24 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 58 

agtcaaccag gtgtggaact cgtc 



<210> 59 

<211> 49 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 59 

ggatcctaat acgactcact atagggagga agagggcagg gctcaagag 



<210> 60 

<211> 42 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 60 

tccatgtgaa gatcccatca cgagtgtgcc tcttgcccaa ag 



<210> 61 

<211> 54 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 61 

ggatcctaat acgactcact atagggaggc cgatgaagtt ttcgacgagt accc



<210> 62 

<211> 52 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 62 

aaggctttaa cgtgtccaat ctggtcaaca tagctctgga gtgcttccaa cc 



<210> 63 

<211> 56 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 63 

ggatcctaat acgactcact atagggagga ttatcgccac atacttcacc aaatgg 



<210> 64 

<211> 52 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 64 

cgagatcgtg gatacgctgg agtgcgtcgc tcttcttctt caacaattca ag 



<210> 65 

<211> 49 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer

<400> 65 

cattgaactg agtagccaaa acagcccatg gtttcaatca atgggaggc 



<210> 66 

<211> 54 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 66 

ggatcctaat acgactcact atagggaggg ccactcggta actttgtcag ggac



<210> 67 

<211> 56 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 67 

ggatcctaat acgactcact atagggaggc ctacgtttgg tatcgctact ccgttg 



<210> 68 

<211> 48 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 68 

tttccagcca gcaccgtcca agcaacaagg agtacaagaa atcgtgtc 



<210> 69 

<211> 53 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 69 

ggatcctaat acgactcact atagggaggg cagagccgat ctatgttgcg tcc



<210> 70 

<211> 45 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 70 

tcattgaatg cttccaggaa cctcgccaca tccatcgaga accgg 



<210> 71 

<211> 49 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 71 

ggatcctaat acgactcact atagggaggc ttgaaggccg tgttgaacg



<210> 72 

<211> 46 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 72 

caggatttgt ctgagttgcc gcctgatcaa gataggatcc ttgccg 



<210> 73 

<211> 56 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 73 

ggatcctaat acgactcact atagggaggg gtttgctgaa tacgctgaag gtgatg 



<210> 74 

<211> 52 

<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Primer 

<400> 74 

tggagctgaa caactctctc gtctcgggtg gtcgaatgga cccttggtca ag 



<210> 75 

<211> 49 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 75 

ggatcctaat acgactcact atagggaggt tcctcaacac ggacagcgg 

<210> 76 

<211> 49 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 76 

agtcaaccag gtgtggaact cgtcggtggc aacaatgaaa aacaccaag 49



<210> 77 

<211> 57 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 77 

ggatcctaat acgactcact atagggaggc cattgccttg agatacgcca ttggtag 57 



<210> 78 

<211> 53 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 78 

agccttggtg tcgttctttt caacggaagg tggtctcgat ggtgtgttca acc 53 

<210> 79 

<211> 55 

<212> DNA 

<213> Artificial Sequence 

<220>

<223> Description of Artificial Sequence: Primer 

<400> 79 

ggatcctaat acgactcact atagggaggt tgggtttgtt tgtttcctgt gtccg 55 



<210> 80 

<211> 50 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 80 

cctttgacct tcaatctggc gtagacgcag caccaccgat ccaccacttg 



<210> 81 
<211> 4206 
<212> DNA 

<213> Candida tropicalis 
<400> 81 

catcaagatc atctatgggg ataattacga cagcaacatt gcagaaagag cgttggtcac 60 
aatcgaaaga gcctatggcg ttgccgtcgt tgaggcaaat gacagcacca acaataacga 120 
tggtcccagt gaagagcctt cagaacagtc cattgttgac gcttaaggca cggataatta 180
cgtggggcaa aggaacgcgg aattagttat ggggggatca aaagcggaag atttgtgttg 240
cttgtgggtt ttttccttta tttttcatat gatttctttg cgcaagtaac atgtgccaat 300
ttagtttgtg attagcgtgc cccacaattg gcatcgtgga cgggcgtgtt ttgtcatacc 360
ccaagtctta actagctcca cagtctcgac ggtgtctcga cgatgtcttc ttccacccct 420
cccatgaatc attcaaagtt gttgggggat ctccaccaag ggcaccggag ttaatgctta 480
tgtttctccc actttggttg tgattggggt agtctagtga gttggagatt ttcttttttt 540
cgcaggtgtc tccgatatcg aaatttgatg aatatagaga gaagccagat cagcacagta 600
gattgccttt gtagttagag atgttgaaca gcaactagtt gaattacacg ccaccacttg 660
acagcaagtg cagtgagctg taaacgatgc agccagagtg tcaccaccaa ctgacgttgg 720
gtggagttgt tgttgttgtt gttggcaggg ccatattgct aaacgaagac aagtagcaca 780
aaacccaagc ttaagaacaa aaataaaaaa aattcatacg acaattccaa agccattgat 840
ttacataatc aacagtaaga cagaaaaaac tttcaacatt tcaaagttcc ctttttccta 900
ttacttcttt tttttcttct ttccttcttt ccttctgttt ttcttacttt atcagtcttt 960
tacttgtttt tgcaattcct catcctcctc ctactcctcc tcaccatggc tttagacaag 1020
ttagatttgt atgtcatcat aacattggtg gtcgctgtag ccgcctattt tgctaagaac 1080
cagttccttg atcagcccca ggacaccggg ttcctcaaca cggacagcgg aagcaactcc 1140
agagacgtct tgctgacatt gaagaagaat aataaaaaca cgttgttgtt gtttgggtcc 1200
cagacgggta cggcagaaga ttacgccaac aaattgtcca gagaattgca ctccagattt 1260
ggcttgaaaa cgatggttgc agatttcgct gattacgatt gggataactt cggagatatc 1320
accgaagaca tcttggtgtt tttcattgtt gccacctatg gtgagggtga acctaccgat 1380
aatgccgacg agttccacac ctggttgact gaagaagctg acactttgag taccttgaaa 1440
tacaccgtgt tcgggttggg taactccacg tacgagttct tcaatgccat tggtagaaag 1500
tttgacagat tgttgagcga gaaaggtggt gacaggtttg ctgaatacgc tgaaggtgat 1560
gacggtactg gcaccttgga cgaagatttc atggcctgga aggacaatgt ctttgacgcc 1620
ttgaagaatg atttgaactt tgaagaaaag gaattgaagt acgaaccaaa cgtgaaattg 1680
actgagagag acgacttgtc tgctgctgac tcccaagttt ccttgggtga gccaaacaag 1740
aagtacatca actccgaggg catcgacttg accaagggtc cattcgacca cacccaccca 1800
tacttggcca gaatcaccga gacgagagag ttgttcagct ccaaggacag acactgtatc 1860
cacgttgaat ttgacatttc tgaatcgaac ttgaaataca ccaccggtga ccatctagct 1920
atctggccat ccaactccga cgaaaacatt aagcaatttg ccaagtgttt cggattggaa 1980
gataaactcg acactgttat tgaattgaag gcgttggact ccacttacac catcccattc 2040
ccaaccccaa ttacctacgg tgctgtcatt agacaccatt tagaaatctc cggtccagtc 2100
tcgagacaat tctttttgtc aattgctggg tttgctcctg atgaagaaac aaagaaggct 2160
tttaccagac ttggtggtga caagcaagaa ttcgccgcca aggtcacccg cagaaagttc 2220
aacattgccg atgccttgtt atattcctcc aacaacgctc catggtccga tgttcctttt 2280
gaattcctta ttgaaaacgt tccacacttg actccacgtt actactccat ttcgtcttcg 2340
tcattgagtg aaaagcaact catcaacgtt actgcagttg ttgaagccga agaagaagct 2400
gatggcagac cagtcactgg tgttgtcacc aacttgttga agaacgttga aattgtgcaa 2460
aacaagactg gcgaaaagcc acttgtccac tacgatttga gcggcccaag aggcaagttc 2520
aacaagttca agttgccagt gcatgtgaga agatccaact ttaagttgcc aaagaactcc 2580
accaccccag ttatcttgat tggtccaggt actggtgttg ccccattgag aggttttgtc 2640
agagaaagag ttcaacaagt caagaatggt gtcaatgttg gcaagacttt gttgttttat 2700
ggttgcagaa actccaacga ggactttttg tacaagcaag aatgggccga gtacgcttct 2760
gttttgggtg aaaactttga gatgttcaat gccttctcca gacaagaccc atccaagaag 2820
gtttacgtcc aggataagat tttagaaaac agccaacttg tgcacgagtt gttgactgaa 2880
ggtgccatta tctacgtctg tggtgatgcc agtagaatgg ctagagacgt gcagaccaca 2940
atttccaaga ttgttgctaa aagcagagaa attagtgaag acaaggctgc tgaattggtc 3000
aagtcctgga aggtccaaaa tagataccaa gaagatgttt ggtagactca aacgaatctc 3060
tctttctccc aacgcattta tgaatcttta ttctcattga agctttacat atgttctaca 3120
ctttattttt tttttttttt ttattattat attacgaaac ataggtcaac tatatatact 3180
tgattaaatg ttatagaaac aataactatt atctactcgt ctacttcttt ggcattgaca 3240
tcaacattac cgttcccatt accgttgccg ttggcaatgc cgggatattt agtacagtat 3300
ctccaatccg gatttgagct attgtagatc agctgcaagt cattctccac cttcaaccag 3360
tacttatact tcatctttga cttcaagtcc aagtcataaa tattacaagt tagcaagaac 3420
ttctggccat ccacgatata gacgttattc acgttattat gcgacgtatg gatgtggtta 3480
tccttattga acttctcaaa cttcaaaaac aaccccacgt cccgcaacgt cattatcaac 3540
gacaagttct ggctcacgtc gtcggagctc gtcaagttct caattagatc gttcttgtta 3600
ttgatcttct ggtactttct caattgctgg aacacattgt cctcgttgtt caaatagatc 3660
ttgaacaact ttttcaacgg gatcaacttc tcaatctggg ccaagatctc cgccgggatc 3720
ttcagaaaca agtcctgcaa cccctggtcg atggtctccg ggtacaacaa gtccaagggg 3780
cagaagtgtc taggcacgtg tttcaactgg ttcaacgaac atgttcgaca gtagttcgag 3840
ttatagttat cgtacaacca ttttggtttg atttcgaaaa tgacggagct gatgccatca 3900
ttctcctggt tcctctcata gtacaactgg cacttcttcg agaggctcaa ttcctcgtag 3960
ttcccgtcca agatattcgg caacaagagc ccgtaccgct cacggagcat caagtcgtgg 4020
ccctggttgt tcaacttgtt gatgaagtcc gaggtcaaga caatcaactg gatgtcgatg 4080
atctggtgcg ggaacaagtt cttgcatttt agctcgatga agtcgtacaa ctcacacgtc 4140
gagatatact cctgttcctc cttcaagagc cggatccgca agagcttgtg cttcaagtag 4200
tcgttg 4206



<210> 82 

<211> 4145 

<212> DNA 

<213> Candida tropicalis 

<400> 82 



tatatgatat atgatatatc ttcctgtgta attattattc gtattcgtta atacttacta 60
catttttttt tctttattta tgaagaaaag gagagttcgt aagttgagtt gagtagaata 120
ggctgttgtg catacgggga gcagaggaga gtatccgacg aggaggaact gggtgaaatt 180
tcatctatgc tgttgcgtcc tgtactgtac tgtaaatctt agatttccta gaggttgttc 240
tagcaaataa agtgtttcaa gatacaattt tacaggcaag ggtaaaggat caactgatta 300
gcggaagatt ggtgttgcct gtggggttct tttatttttc atatgatttc tttgcgcgag 360
taacatgtgc caatctagtt tatgattagc gtacctccac aattggcatc ttggacgggc 420
gtgttttgtc ttaccccaag ccttatttag ttccacagtc tcgacggtgt ctcgccgatg 480
tcttctccca cccctcgcag gaatcattcg aagttgttgg gggatctcct ccgcagttta 540
tgttcatgtc tttcccactt tggttgtgat tggggtagcg tagtgagttg gtgattttct 600
tttttcgcag gtgtctccga tatcgaagtt tgatgaatat aggagccaga tcagcatggt 660
atattgcctt tgtagataga gatgttgaac aacaactagc tgaattacac accaccgcta 720
aacgatgcgc acagggtgtc accgccaact gacgttgggt ggagttgttg ttggcagggc 780
catattgcta aacgaagaga agtagcacaa aacccaaggt taagaacaat taaaaaaatt 840
catacgacaa ttccacagcc atttacataa tcaacagcga caaatgagac agaaaaaact 900
ttcaacattt caaagttccc tttttcctat tacttctttt tttctttcct tcctttcatt 960
tcctttcctt ctgcttttat tactttacca gtcttttgct tgtttttgca attcctcatc 1020
ctcctcctca


ccatggcttt 


agacaagtta 


gau Ltgtatg 


t catcataac 


a u ug g uy y 


1 ORO 

J. u o u 




ccua Lu L cgc 


taagaaccag 


ttccttgatc 


agccccagga 


caccyyyL ll* 


1140 

X J, 1 


ctcaacacgg 


acagcggaag 


caactccaga 


gacgtcttgc 


tgaca c rgaa 


gaagaa.Ccia.L. 


X ^ VJ u 


aaaaEcacgt 


T-gi, ugu ug u u 


tgggtcccag 


accggtacgg 


cagaagatt a 


Oyv^(-.cla.^^a.clCL 


X V 


u u g L. Ca a ga g 


aa L ugcaccc 


ca gar Luggc 


Ltgaaaacca 


x.gg LX.guay a 


L. L. L.(— yo cyci L. 


1320 


uacgat rggg 


araacurcgg 


agatatcacc 


gaagat atct 


4_ _~.4- _4_+- J- j-j- 
tggugL cT-TZ L 


caccyucyoo 


X *J o u 


acctacggtg 


agggtgaacc 


taccgacaat 


gccgacgagt 


tccacacctg 


gT-LyacLyad. 


X ri y 


gaagctgaca 


ctttgagtac 


tttgagatat 


accgtgttcg 


ggttgggtaa 




T ROO 

X o u u 


gagttcttca 


atgctattgg 


tagaaagttt 


gacagattgt 


tgagtgagaa 


aggtggtgac 




agatttgctg 


aat atgctga 


aggtgacgac 


ggcactggca 


ccttggacga 


agaT-uncaLy 




gcctggaagg 


ataa ugx-C Lu 


tgacgccttg 


aagaatgact 


ugaac c u Lga 


ciyclciciciyya.ci 


X u o u 


ttgaagtacg 


aaccaaacgt 


gaaattgact 


gagagagatg 


acttgtctgc 


t^gccy acLL-u 




caagtttcct 


tgggtgagcc 


aaacaagaag 


tacatcaact 


ccgagggcat 


cgacttgacc 


1 Q n n 
i. o uu 


aagggtccat 


tcgaccacac 


ccacccatac 


ttggccagga 


tcaccgagac 


cagagagttg 


-t Q /T n 
iobU 


ttcagctcca 


aggaaagaca 


ctgtattcac 


gttgaatttg 


acatttctga 


atcgaacttg 


1 ri o A 

1920 


aaatacacca 


ccggtgacca 


tctagccatc 


tggccatcca 


actccgacga 


aaacatcaag 


T Q Q Pi 


caatttgcca 


agtgttt egg 


^attggaagat 


aaactcgaca 


ctgttattga 


at tgaaggca 


904 0 
Z U 4 u 


ttggactcca 


cttacaccat 


tccattccca 


actccaatta 


cttacggtgc 


tgtcattaga 


Z X u u 


caccatttag 


aaatctccgg 


tccagtctcg 


agacaattct 


ttttgtcgat 


tgctgggttt 


z 1 oU 


gctcctgatg 


aagaaacaaa 


gaagactttc 


accagacttg 


gtggtgacaa 


acaagaattc 


O O O A 


gccaccaagg 


ttacccgcag 


aaagttcaac 


attgccgatg 


ccttgttata 


ttcctccaac 


o o o o 


aacactccat 


ggtccgatgt 


tccttttgag 


ttccttattg 


aaaacatcca 


acacttgact 


Z JS4 U 


ccacgttact 


actccatttc 


ttcttcgtcg 


ttgagtgaaa 


aacaactcat 


caatgttact 


z 4H)U 


gcagtcgttg 


aggccgaaga 


agaagccgat 


ggcagaccag 


tcactggtgt 


tgttaccaac 


Z4 bU 


ttgttgaaga 


acattgaaat 


tgcgcaaaac 


aagactggcg 


aaaagccact 


tgttcactac 


■o c o ri 


gatttgagcg 


gcccaagagg 


caagttcaac 


aagttcaagt 


tgccagtgca 


cgtgagaaga 


O C O A 

zboU 


tccaacttta agttgccaaa gaactccacc accccagtta tcttgattgg tccaggtact 2640
ggtgttgccc cattgagagg tttcgttaga gaaagagttc aacaagtcaa gaatggtgtc 2700
aatgttggca agactttgtt gttttatggt tgcagaaact ccaacgagga ctttttgtac 2760
aagcaagaat gggccgagta cgcttctgtt ttgggtgaaa actttgagat gttcaatgcc 2820
ttctctagac aagacccatc caagaaggtt tacgtccagg ataagatttt agaaaacagc 2880

caacttgtgc acgaattgtt gaccgaaggt gccattatct acgtctgtgg tgacgccagt 2940 

agaatggcca gagacgtcca gaccacgatc tccaagattg ttgccaaaag cagagaaatc 3000 

agtgaagaca aggccgctga attggtcaag tcctggaaag tccaaaatag ataccaagaa 3060 

gatgtttggt agactcaaac gaatctctct ttctcccaac gcatttatga atattctcat 3120 

tgaagtttta catatgttct atatttcatt ttttttttat tatattacga aacataggtc 3180 

aactatatat acttgattaa atgttataga aacaataatt attatctact cgtctacttc 3240 

tttggcattg gcattggcat tggcattggc attgccgttg ccgttggtaa tgccgggata 3300 

tttagtacag tatctccaat ccggatttga gctattgtaa atcagctgca agtcattctc 3360 

caccttcaac cagtacttat acttcatctt tgacttcaag tccaagtcat aaatattaca 3420 

agttagcaag aacttctggc catccacaat atagacgtta ttcacgttat tatgcgacgt 3480 

atggatatgg ttatccttat tgaacttctc aaacttcaaa aacaacccca cgtcccgcaa 3540 

cgtcattatc aacgacaagt tctgactcac gtcgtcggag ctcgtcaagt tctcaattag 3600 

atcgttcttg ttattgatct tctggtactt tctcaactgc tggaacacat tgtcctcgtt 3660 

gttcaaatag atcttgaaca acttcttcaa gggaatcaac ttttcgatct gggccaagat 3720 

ttccgccggg atcttcagaa acaagtcctg caacccctgg tcgatggtct cggggtacaa 3780 

caagtctaag gggcagaagt gtctaggcac gtgtttcaac tggttcaagg aacatgttcg 3840 

acagtagttc gagttatagt tatcgtacaa ccactttggc ttgatttcga aaatgacgga 3900 

gctgatccca tcattctcct ggttcctttc atagtacaac tggcatttct tcgagagact 3960 

caactcctcg tagttcccgt ccaagatatt cggcaacaag agcccgtagc gctcacggag 4020 

catcaagtcg tggccctggt tgttcaactt gttgatgaag tccgatgtca agacaatcaa 4080 

ctggatgtcg atgatctggt gcggaaacaa gttcttgcac tttagctcga tgaagtcgta 4140 

caact 4145 

<210> 83 
<211> 679 
<212> PRT 

<213> Candida tropicalis
<400> 83 

Met Ala Leu Asp Lys Leu Asp Leu Tyr Val Ile Ile Thr Leu Val Val
1 5 10 15

Ala Val Ala Ala Tyr Phe Ala Lys Asn Gln Phe Leu Asp Gln Pro Gln
20 25 30

Asp Thr Gly Phe Leu Asn Thr Asp Ser Gly Ser Asn Ser Arg Asp Val
35 40 45

Leu Leu Thr Leu Lys Lys Asn Asn Lys Asn Thr Leu Leu Leu Phe Gly
50 55 60

Ser Gln Thr Gly Thr Ala Glu Asp Tyr Ala Asn Lys Leu Ser Arg Glu
65 70 75 80

Leu His Ser Arg Phe Gly Leu Lys Thr Met Val Ala Asp Phe Ala Asp
85 90 95

Tyr Asp Trp Asp Asn Phe Gly Asp Ile Thr Glu Asp Ile Leu Val Phe
100 105 110

Phe Ile Val Ala Thr Tyr Gly Glu Gly Glu Pro Thr Asp Asn Ala Asp
115 120 125

Glu Phe His Thr Trp Leu Thr Glu Glu Ala Asp Thr Leu Ser Thr Leu
130 135 140

Lys Tyr Thr Val Phe Gly Leu Gly Asn Ser Thr Tyr Glu Phe Phe Asn
145 150 155 160

Ala Ile Gly Arg Lys Phe Asp Arg Leu Leu Ser Glu Lys Gly Gly Asp
165 170 175

Arg Phe Ala Glu Tyr Ala Glu Gly Asp Asp Gly Thr Gly Thr Leu Asp
180 185 190

Glu Asp Phe Met Ala Trp Lys Asp Asn Val Phe Asp Ala Leu Lys Asn
195 200 205

Asp Leu Asn Phe Glu Glu Lys Glu Leu Lys Tyr Glu Pro Asn Val Lys
210 215 220

Leu Thr Glu Arg Asp Asp Leu Ser Ala Ala Asp Ser Gln Val Ser Leu
225 230 235 240

Gly Glu Pro Asn Lys Lys Tyr Ile Asn Ser Glu Gly Ile Asp Leu Thr
245 250 255

Lys Gly Pro Phe Asp His Thr His Pro Tyr Leu Ala Arg Ile Thr Glu
260 265 270

Thr Arg Glu Leu Phe Ser Ser Lys Asp Arg His Cys Ile His Val Glu
275 280 285

Phe Asp Ile Ser Glu Ser Asn Leu Lys Tyr Thr Thr Gly Asp His Leu
290 295 300

Ala Ile Trp Pro Ser Asn Ser Asp Glu Asn Ile Lys Gln Phe Ala Lys
305 310 315 320

Cys Phe Gly Leu Glu Asp Lys Leu Asp Thr Val Ile Glu Leu Lys Ala
325 330 335

Leu Asp Ser Thr Tyr Thr Ile Pro Phe Pro Thr Pro Ile Thr Tyr Gly
340 345 350

Ala Val Ile Arg His His Leu Glu Ile Ser Gly Pro Val Ser Arg Gln
355 360 365

Phe Phe Leu Ser Ile Ala Gly Phe Ala Pro Asp Glu Glu Thr Lys Lys
370 375 380

Ala Phe Thr Arg Leu Gly Gly Asp Lys Gln Glu Phe Ala Ala Lys Val
385 390 395 400

Thr Arg Arg Lys Phe Asn Ile Ala Asp Ala Leu Leu Tyr Ser Ser Asn
405 410 415

Asn Ala Pro Trp Ser Asp Val Pro Phe Glu Phe Leu Ile Glu Asn Val
420 425 430

Pro His Leu Thr Pro Arg Tyr Tyr Ser Ile Ser Ser Ser Ser Leu Ser
435 440 445

Glu Lys Gln Leu Ile Asn Val Thr Ala Val Val Glu Ala Glu Glu Glu
450 455 460

Ala Asp Gly Arg Pro Val Thr Gly Val Val Thr Asn Leu Leu Lys Asn
465 470 475 480

Val Glu Ile Val Gln Asn Lys Thr Gly Glu Lys Pro Leu Val His Tyr
485 490 495

Asp Leu Ser Gly Pro Arg Gly Lys Phe Asn Lys Phe Lys Leu Pro Val
500 505 510

His Val Arg Arg Ser Asn Phe Lys Leu Pro Lys Asn Ser Thr Thr Pro
515 520 525

Val Ile Leu Ile Gly Pro Gly Thr Gly Val Ala Pro Leu Arg Gly Phe
530 535 540

Val Arg Glu Arg Val Gln Gln Val Lys Asn Gly Val Asn Val Gly Lys
545 550 555 560

Thr Leu Leu Phe Tyr Gly Cys Arg Asn Ser Asn Glu Asp Phe Leu Tyr
565 570 575

Lys Gln Glu Trp Ala Glu Tyr Ala Ser Val Leu Gly Glu Asn Phe Glu
580 585 590

Met Phe Asn Ala Phe Ser Arg Gln Asp Pro Ser Lys Lys Val Tyr Val
595 600 605

Gln Asp Lys Ile Leu Glu Asn Ser Gln Leu Val His Glu Leu Leu Thr
610 615 620

Glu Gly Ala Ile Ile Tyr Val Cys Gly Asp Ala Ser Arg Met Ala Arg
625 630 635 640

Asp Val Gln Thr Thr Ile Ser Lys Ile Val Ala Lys Ser Arg Glu Ile
645 650 655

Ser Glu Asp Lys Ala Ala Glu Leu Val Lys Ser Trp Lys Val Gln Asn
660 665 670

Arg Tyr Gln Glu Asp Val Trp
675



<210> 84 
<211> 679 
<212> PRT 

<213> Candida tropicalis
<400> 84 

Met Ala Leu Asp Lys Leu Asp Leu Tyr Val Ile Ile Thr Leu Val Val
1 5 10 15

Ala Val Ala Ala Tyr Phe Ala Lys Asn Gln Phe Leu Asp Gln Pro Gln
20 25 30

Asp Thr Gly Phe Leu Asn Thr Asp Ser Gly Ser Asn Ser Arg Asp Val
35 40 45

Leu Leu Thr Leu Lys Lys Asn Asn Lys Asn Thr Leu Leu Leu Phe Gly
50 55 60

Ser Gln Thr Gly Thr Ala Glu Asp Tyr Ala Asn Lys Leu Ser Arg Glu
65 70 75 80

Leu His Ser Arg Phe Gly Leu Lys Thr Met Val Ala Asp Phe Ala Asp
85 90 95

Tyr Asp Trp Asp Asn Phe Gly Asp Ile Thr Glu Asp Ile Leu Val Phe
100 105 110

Phe Ile Val Ala Thr Tyr Gly Glu Gly Glu Pro Thr Asp Asn Ala Asp
115 120 125

Glu Phe His Thr Trp Leu Thr Glu Glu Ala Asp Thr Leu Ser Thr Leu
130 135 140

Arg Tyr Thr Val Phe Gly Leu Gly Asn Ser Thr Tyr Glu Phe Phe Asn
145 150 155 160

Ala Ile Gly Arg Lys Phe Asp Arg Leu Leu Ser Glu Lys Gly Gly Asp
165 170 175

Arg Phe Ala Glu Tyr Ala Glu Gly Asp Asp Gly Thr Gly Thr Leu Asp
180 185 190

Glu Asp Phe Met Ala Trp Lys Asp Asn Val Phe Asp Ala Leu Lys Asn
195 200 205

Asp Leu Asn Phe Glu Glu Lys Glu Leu Lys Tyr Glu Pro Asn Val Lys
210 215 220

Leu Thr Glu Arg Asp Asp Leu Ser Ala Ala Asp Ser Gln Val Ser Leu
225 230 235 240

Gly Glu Pro Asn Lys Lys Tyr Ile Asn Ser Glu Gly Ile Asp Leu Thr
245 250 255

Lys Gly Pro Phe Asp His Thr His Pro Tyr Leu Ala Arg Ile Thr Glu
260 265 270

Thr Arg Glu Leu Phe Ser Ser Lys Glu Arg His Cys Ile His Val Glu
275 280 285

Phe Asp Ile Ser Glu Ser Asn Leu Lys Tyr Thr Thr Gly Asp His Leu
290 295 300

Ala Ile Trp Pro Ser Asn Ser Asp Glu Asn Ile Lys Gln Phe Ala Lys
305 310 315 320

Cys Phe Gly Leu Glu Asp Lys Leu Asp Thr Val Ile Glu Leu Lys Ala
325 330 335

Leu Asp Ser Thr Tyr Thr Ile Pro Phe Pro Thr Pro Ile Thr Tyr Gly
340 345 350

Ala Val Ile Arg His His Leu Glu Ile Ser Gly Pro Val Ser Arg Gln
355 360 365

Phe Phe Leu Ser Ile Ala Gly Phe Ala Pro Asp Glu Glu Thr Lys Lys
370 375 380

Thr Phe Thr Arg Leu Gly Gly Asp Lys Gln Glu Phe Ala Thr Lys Val
385 390 395 400

Thr Arg Arg Lys Phe Asn Ile Ala Asp Ala Leu Leu Tyr Ser Ser Asn
405 410 415

Asn Thr Pro Trp Ser Asp Val Pro Phe Glu Phe Leu Ile Glu Asn Ile
420 425 430

Gln His Leu Thr Pro Arg Tyr Tyr Ser Ile Ser Ser Ser Ser Leu Ser
435 440 445

Glu Lys Gln Leu Ile Asn Val Thr Ala Val Val Glu Ala Glu Glu Glu
450 455 460

Ala Asp Gly Arg Pro Val Thr Gly Val Val Thr Asn Leu Leu Lys Asn
465 470 475 480

Ile Glu Ile Ala Gln Asn Lys Thr Gly Glu Lys Pro Leu Val His Tyr
485 490 495

Asp Leu Ser Gly Pro Arg Gly Lys Phe Asn Lys Phe Lys Leu Pro Val
500 505 510

His Val Arg Arg Ser Asn Phe Lys Leu Pro Lys Asn Ser Thr Thr Pro
515 520 525

Val Ile Leu Ile Gly Pro Gly Thr Gly Val Ala Pro Leu Arg Gly Phe
530 535 540

Val Arg Glu Arg Val Gln Gln Val Lys Asn Gly Val Asn Val Gly Lys
545 550 555 560

Thr Leu Leu Phe Tyr Gly Cys Arg Asn Ser Asn Glu Asp Phe Leu Tyr
565 570 575

Lys Gln Glu Trp Ala Glu Tyr Ala Ser Val Leu Gly Glu Asn Phe Glu
580 585 590

Met Phe Asn Ala Phe Ser Arg Gln Asp Pro Ser Lys Lys Val Tyr Val
595 600 605

Gln Asp Lys Ile Leu Glu Asn Ser Gln Leu Val His Glu Leu Leu Thr
610 615 620

Glu Gly Ala Ile Ile Tyr Val Cys Gly Asp Ala Ser Arg Met Ala Arg
625 630 635 640

Asp Val Gln Thr Thr Ile Ser Lys Ile Val Ala Lys Ser Arg Glu Ile
645 650 655

Ser Glu Asp Lys Ala Ala Glu Leu Val Lys Ser Trp Lys Val Gln Asn
660 665 670

Arg Tyr Gln Glu Asp Val Trp
675
<400> 85



catatgcgct aafecttcttt ttctttttat cacaggagaa actatcccac ccccacttcg 60
aaacacaatg acaactcctg cgtaacttgc aaattcttgt ctgactaatt gaaaactccg 120
gacgagtcag acctccagtc aaacggacag acagacaaac acttggtgcg atgttcatac 180
ctacagacat gtcaacgggt gttagacgac ggtttcttgc aaagacaggt gttggcatct 240
cgtacgatgg caactgcagg aggtgtcgac ttctccttta ggcaatagaa aaagactaag 300
agaacagcgt ttttacaggt tgcattggtt aatgtagtat ttttttagtc ccagcattct 360
gtgggttgct ctgggtttct agaataggaa atcacaggag aatgcaaatt cagatggaag 420
aacaaagaga taaaaaacaa aaaaaaactg agttttgcac caatagaatg tttgatgata 480
tcatccactc gctaaacgaa tcatgtgggt gatcttctct ttagttttgg tctatcataa 540
aacacatgaa agtgaaatcc aaatacacta cactccgggt attgtccttc gttttacaga 600
tgtctcattg tcttactttt gaggtcatag gagttgcctg tgagagatca cagagattat 660
cacactcaca tttatcgtag tttcctatct catgctgtgt gtctctggtt ggttcatgag 720
tttggattgt tgtacattaa aggaatcgct ggaaagcaaa gctaactaaa ttttctttgt 780
cacaggtaca ctaacctgta aaacttcact gccacgccag tctttcctga ttgggcaagt 840
gcacaaacta caacctgcaa aacagcactc cgcttgtcac aggttgtctc ctctcaacca 900
acaaaaaaat aagattaaac tttctttgct catgcatcaa tcggagttat ctctgaaaga 960
gttgcctttg tgtaatgtgt gccaaactca aactgcaaaa ctaaccacag aatgatttcc 1020
ctcacaatta tataaactca cccacatttc cacagaccgt aatttcatgt ctcactttct 1080
cttttgctct tcttttactt agtcaggttt gataacttcc ttttttatta ccctatctta 1140
tttatttatt tattcattta taccaaccaa ccaaccatgg ccacacaaga aatcatcgat 1200
tctgtacttc cgtacttgac caaatggtac actgtgatta ctgcagcagt attagtcttc 1260
cttatctcca caaacatcaa gaactacgtc aaggcaaaga aattgaaatg tgtcgatcca 1320
ccatacttga aggatgccgg tctcactggt attctgtctt tgatcgccgc catcaaggcc 1380
aagaacgacg gtagattggc taactttgcc gatgaagttt tcgacgagta cccaaaccac 1440
accttctact tgtctgttgc cggtgctttg aagattgtca tgactgttga cccagaaaac 1500
atcaaggctg tcttggccac ccaattcact gacttctcct tgggtaccag acacgcccac 1560
tttgctcctt tgttgggtga cggtatcttc accttggacg gagaaggttg gaagcactcc 1620
agagctatgt tgagaccaca gtttgctaga gaccagattg gacacgttaa agccttggaa 1680
ccacacatcc aaatcatggc taagcagatc aagttgaacc agggaaagac tttcgatatc 1740
caagaattgt tctttagatt taccgtcgac accgctactg agttcttgtt tggtgaatcc 1800
gttcactcct tgtacgatga aaaattgggc atcccaactc caaacgaaat cccaggaaga 1860
gaaaactttg ccgctgcttt caacgtttcc caacactact tggccaccag aagttactcc 1920
cagacttttt actttttgac caaccctaag gaattcagag actgtaacgc caaggtccac 1980
cacttggcca agtactttgt caacaaggcc ttgaacttta ctcctgaaga actcgaagag 2040
aaatccaagt ccggttacgt tttcttgtac gaattggtta agcaaaccag agatccaaag 2100






gtcttgcaag atcaattgtt gaacattatg gttgccggaa gagacaccac tgccggtttg 2160
ttgtcctttg ctttgtttga attggctaga cacccagaga tgtggtccaa gttgagagaa 2220
gaaatcgaag ttaactttgg tgttggtgaa gactcccgcg ttgaagaaat taccttcgaa 2280
gccttgaaga gatgtgaata cttgaaggct atccttaacg aaaccttgcg tatgtaccca 2340
tctgttcctg tcaactttag aaccgccacc agagacacca ctttgccaag aggtggtggt 2400
gctaacggta ccgacccaat ctacattcct aaaggctcca ctgttgctta cgttgtctac 2460
aagacccacc gtttggaaga atactacggt aaggacgcta acgacttcag accagaaaga 2520
tggtttgaac catctactaa gaagttgggc tgggcttatg ttccattcaa cggtggtcca 2580
agagtctgct tgggtcaaca attcgccttg actgaagctt cttatgtgat cactagattg 2640
gcccagatgt ttgaaactgt ctcatctgat ccaggtctcg aataccctcc accaaagtgt 2700
attcacttga ccatgagtca caacgatggt gtctttgtca agatgtaaag tagtcgatgc 2760
tgggtattcg attacatgtg tataggaaga ttttggtttt ttattcgttc ttttttttaa 2820
tttttgttaa attagtttag agatttcatt aatacataga tgggtgctat ttccgaaact 2880
ttacttctat ccGctgtatc ccttattatc cctctcagtc acatgattgc tgtaattgtc 2940
gtgcaggaca caaactccct aacggactta aaccataaac aagctcagaa ccataagccg 3000
acatcactcc ttcttctctc ttctccaacc aatagcatgg acagacccac cctcctatcc 3060
gaatcgaaga cGcttattga ctccataccc acctggaagc ccctcaagcc acacacgtca 3120
tccagcccac ccatcaccac atccctctac tcgacaacgt ccaaagacgg cgagttctgg 3180
tgtgcccgga aatcagccat cccggccaca tacaagcagc cgttgattgc gtgcatactc 3240
ggcgagccca caatgggagc cacgcattcg gaccatgaag caaagtacat tcacgagatc 3300
acgggtgttt cagtgtcgca gattgagaag ttcgacgatg gatggaagta cgatctcgtt 3360
gcggattacg acttcggtgg gttgttatct aaacgaagat tctatgagac gcagcatgtg 3420
tttcggttcg aggattgtgc gtacgtcatg agtgtgcctt ttgatggacc caaggaggaa 3480
ggttacgtgg ttgggacgta cagatccatt gaaaggttga gctggggtaa agacggggac 3540
gtggagtgga ccatggcgac gacgtcggat cctggtgggt ttatcccgca atggataact 3600
cgattgagca tccctggagc aatcgcaaaa gatgtgccta gtgtattaaa ctacatacag 3660
aaataaaaac gtgtcttgat tcattggttt ggttcttgtt gggttccgag ccaatatttc 3720
acatcatctc ctaaattctc caagaatccc aacgtagcgt agtccagcac gccctctgag 3780
atcttattta atatcgactt ctcaaccacc ggtggaatcc cgttcagacc attgttacct 3840
gtagtgtgtt tgctcttgtt cttgatgaca atgatgtatt tgtcacgata cctgaaataa 3900
taaaacatcc agtcattgag cttattactc gtgaacttat gaaagaactc attcaagccg 3960
ttcccaaaaa acccagaatt gaagatcttg ctcaactggt catgcaagta gtagatcgcc 4020
atgatctgat actttaccaa gctatcctct ccaagttctc ccacgtacgg caagtacggc 4080
aacgagctct ggaagctttg ttgtttgggg tcata
<400> 86
gacctgtgac gcttccggtg tcttgccacc agtctccaag ttgaccgacg cccaagtcat 60
gtaccacttt atttccggtt acacttccaa gatggctggt actgaagaag gtgtcacgga 120
accacaagct actttctccg cttgtttcgg tcaaccattc ttggtgttgc acccaatgaa 180
gtacgctcaa caattgtctg acaagatctc gcaacacaag gctaacgcct ggttgttgaa 240
caccggttgg gttggttctt ctgctgctag aggtggtaag agatgctcat tgaagtacac 300
cagagccatt ttggacgcta tccactctgg tgaattgtcc aaggttgaat acgaaacttt 360
cccagtcttc aacttgaatg tcccaacctc ctgtccaggt gtcccaagtg aaatcttgaa 420
cccaaccaag gcctggaccg gaaggtgttg actccttcaa caaggaaatc aagtctttgg 480
ctggtaagtt tgctgaaaac ttcaagacct atgctgacca agctaccgct gaagtgagag 540
ctgcaggtcc agaagcttaa agatatttat tcattattta gtttgcctat ttatttctca 600
ttacccatca tcattcaaca ctatatataa agttacttcg gatatcattg taatcgtgcg 660
tgtcgcaatt ggatgatttg gaactgcgct tgaaacggat tcatgcacga agcggagata 720
aaagattacg taatttatct cctgagacaa ttttagccgt gttcacacgc ccttctttgt 780
tctgagcgaa ggataaataa ttagacttcc acagctcatt ctaatttccg tcacgcgaat 840
attgaagggg ggtacatgtg gccgctgaat gtgggggcag taaacgcagt ctctcctctc 900
ccaggaatag tgcaacggag gaaggataac ggatagaaag cggaatgcga ggaaaatttt 960
gaacgcgcaa gaaaagcaat atccgggcta ccaggttttg agccagggaa cacactccta 1020
tttctgctca atgactgaac atagaaaaaa caccaagacg caatgaaacg cacatggaca 1080
tttagacctc cccacatgtg atagtttgtc ttaacagaaa agtataataa gaacccatgc 1140
cgtccctttt ctttcgccgc ttcaactttt ttttttttat cttacacaca tcacgaccat 1200
gactgtacac gatattatcg ccacatactt caccaaatgg tacgtgatag taccactcgc 1260
tttgattgct tatagagtcc tcgactactt ctatggcaga tacttgatgt acaagcttgg 1320

ngcraascca 


4^ ^ *H "t" 4" r*^ *3 rr 

c L c L Lccaga 


aacagacaga 


cggc ug u uuc 


uydUUL-dddy 


ou^oyou v^yci 


1380 


ra "t" PT'f" rra zi rr 

ciL.Lyuuya.ciy 


ddydciyciy L,y 


2 3 z"' /"^ /~< +- 

dCyy UdCCUL 


CaUdyaCUuC 


dCcio u^^L^dy*^ 


(TI" PiiT'PriPO'a 


1440 


uCuCy aucgu 


cccga uau CC 


caacuuucac 


d u ucccgguc 


U UU U CCd Uvwd 


d^_r^u. u.v^a-ci 


1500 


t acccttgag 


ccggagaaca 


tcaaggccat 


cttggccact 


caguucaacg 


dU U UU-UL^^^U U 


1560 


gggtaccaga 


cactcgcact 


uugcuccu U U 


gttgggtgat 


gguaucu u ua 


cgu uyyduyy 




cgccggctgg 


aagcacagca 


gatctatgtt 


gagaccacag 


tttgccagag 


aacagduuuc 


±.\JQ\J 


CCaCguCaag 


u uguuygagc 


CdCdcgu uca 


gguguucuuc 


dddCdCy UCd 


ydddy yod^d 


1740 


gggcaagacu 


u u ugaco. ucc 


aggaa uugu u 


^ 4^ 4* ^ *^ 4* 4* 

u uucagau ug 


accyucydcu 


o^^yo^i-'d^s-'yd 


1800 


guu u uugu u u 


gg ugaauccg 


uugaguccu u 


gagaga ugaa 


4~ y-i 4- ^ 4- d 

ucuducggcci 


L.y i„t_-v_-d u<w.ad 


1860 


t gcgcttgac 


tttgacggca 


aggctggctt 


tgctgatgct 


uu uaacua ui. 


cyCciycidU ud 




tttggcttcg 


agagcggtta 


tgcaacaatt 


gtactgggtg 


ttgaacggga 


aaaagttt aa 




ggagtgcaac 


gctaaagtgc 


acaagtttgc 


tgactactac 


gtcaacaagg 


^ ^ 4" ^ 4" 4^ 

cu uuggacuu 


Z U U 


gacgcctgaa 


caattggaaa 


agcaggatgg 


ttatgtgttt 


ttgtacgaat 


t ggt caagca 


0 1 on 

Z X u u 


aaccagagac 


aagcaagtgt 


tgagagacca 


attgttgaac 


atcatggttg 


diggtagaga 


Z i oU 


caccaccgcc 


gguu ugt: ugu 


cgt uugtttu 


ct utgaattg 


gccagaaacc 


cagaay u uac 


z z u 


caacaagttg 


agagaagaaa 


ttgaggacaa 


gtttggactc 


ggtgagaatg 


ctagt gttga 


Z Z 0 u 


agacattt cc 


tttgagtcgt 


tgaagtcctg 


tgaatacttg 


aaggctgttc 


t caacgaaac 


Z ^4 U 


cttgagattg 


tacccatccg 


tgccacagaa 


tt ucagagtt 


gccaccaaga 


acactaccct 




cccaagaggt 


9^tggtaagg 


^cgggttgtc 


tcctgttttg 


gtgagaaagg 


gtcagaccgt 


Z 4 OU 


tat: u tacggu 


gtct acgcag 


cccacagaaa 


cccagctgtt 


tacggtaagg 


acgct cttga 


9 R9n 

Z J zu 


gtttagacca 


gagagatggt 


ttgagccaga 


gacaaagaag 


cttggctggg 


cctt cot ccc 


Z 3 O U 


attcaacggt 


ggtccaagaa 


tctgtttggg 


acagcagttt 


gccttgacag 


aagctt cgta 


Z 04 U 


tgtcactgtc 


aggttgctcc 


aggagtttgc 


acacttgtct 


atggacccag 


acaccgaata 


9 1 n n 
Z / u u 


tccacctaag 


aaaatgtcgc 


atttgaccat 


gtcgcttttc 


gacggtgcca 


atattgagat 


9 n C Pi 

z / oU 


gtattagagg 


gtcatgtgtt 


attttgattg 


tttagtttgt 


aattactgat 


taggttaatt 


O O O A 

zSzu 


catggattgt 


tatttattga 


taggggtttg 


cgcgtgttgc 


attcacttgg 


gatcgtt oca 


9 fi p n 

Z 0 0 u 


ggttgatgtt tccttccatc ctgtcgagtc aaaaggagtt ttgttttgta actccggacg 2940
atgttttaaa tagaaggtcg atctccatgt gattgttttg actgttactg tgattatgta 3000
atctgcggac gttatacaag catgtgattg tggttttgca gccttttgca cgacaaatga 3060
tcgtcagacg attacgtaat ctttgttaga ggggtaaaaa aaaacaaaat ggcagccaga 3120










^ ^ 4* :3 -3 ^ ^ 

aa LgCaaaaa 


augggaaac L 


CCaaCaya.Oa. 


CtCluCtClClClClCtCL 


3180 




/-*4— ^ O >— 1 

cx-ccgaaccG 


acagaacaat 


ggggcgccag 


aau ua L ugac 




J j£i *3 VJ 


4-4-4*4-4- -3<^f-^/-^4- 

lc L L LdCgcr 


aacgctcatt 


gcagtgtagt 


gcgtctt aca 


cggggtattg 


/^4-4-4"(^4*d/^^3 


0 J u U 




cagttgaagg 


tttgcaccta 


acgttgcccc 


gt gtcaactc 


aa cT-ugacga 


^ J D U 


gtaacttcct 


aagctcgaat 


tatgcagctc 


gtgcgtcaac 


ctatgtgcag 


gaaagaaaaa 




atccaaaaaa 


atcgaaaatg 


cgactttcga 


ttttgaataa 


accaaaaaga 


aaaatgtcgc 


J 0 u 


•^/^4-4-4-4-4-4- /-.+• 


cgctctcgct 


ctctcgaccG 


aaatcacaac 


aaatcct cgc 


gcgcagtatt 




tcgacgaaac 


cacaacaaat 


aaaaaaaaca 


aattctacac 


cacttcttt c 


tct tcaccag 


J bu u 


tcaacaaaaa 


acaacaaatt 


atacaccatt 


tcaacgattt 


titgcLCLtat 


aaa ugc ua xza 


0 0 DU 


caaLggtu ca 


attcaactca 


ggtatgttta 


ttttactgtt 


ttcagctcaa 


gtatgttcaa 


*5 T 0 n 
0 /zu 


atactaacta cttttgatgt ttgtcgcttt tctagaatca aaacaacgcc cacaacacgc 3780
cgagcttgtc gaatagacgg tttgtttact cattagatgg tcccagatta cttttcaagc 3840
caaagtctct cgagttttgt ttgctgtttc cccaattcct aactatgaag ggtttttata 3900
aggtccaaag accccaaggc atagtttttt tggttccttc ttgtcgtg
<400> 87
gctcaacaat tgtctgacaa gatctcgcaa cacaaggcta acgcctggtt gttgaacact 60
ggttgggttg gttcttctgc tgctagaggt ggtaagagat gttcattgaa gtacaccaga 120
gccattttgg acgctatcca ctctggtgaa ttgtccaagg ttgaatacga gactttccca 180
gtcttcaact tgaatgtccc aacctcctgc ccaggtgtcc caagtgaaat cttgaaccca 240
accaaggcct ggaccgaagg tgttgactcc ttcaacaagg aaatcaagtc tttggctggt 300
aagtttgctg aaaacttcaa gacctatgct gaccaagcta ccgctgaagt tagagctgca 360
ggtccagaag cttaaagata tttattcact atttagtttg cctatttatt tctcatcacc 420
catcatcatt caacaatata tataaagtta tttcggaact catatatcat tgtaatcgtg 480
cgtgttgcaa ttgggtaatt tgaaactgta gttggaacgg attcatgcac gatgcggaga 540
taacacgaga ttatctccta agacaatttt ggcctcattc acacgccctt cttctgagct 600
aaggataaat aattagactt cacaagttca ttaaaatatc cgtcacgcga aaactgcaac 660
aataaggaag gggggggtag acgtagccga tgaatgtggg gtgccagtaa acgcagtctc 720
tctctccccc cccccccccc ccccctcagg aatagtacaa cgggggaagg ataacggata 780
gcaagtggaa tgcgaggaaa attttgaatg cgcaaggaaa gcaatatccg ggctatcagg 840
ttttgagcca ggggacacac tcctcttctg cacaaaaact taacgtagac aaaaaaaaaa 900
aactccacca agacacaatg aatcgcacat ggacatttag acctccccac atgtgaaagc 960
ttctctggcg aaagcaaaaa aagtataata aggacccatg ccttccctct tcctgggccg 1020
tttcaacttt ttctttttct ttgtctatca acacacacac acctcacgac catgactgca 1080
caggatatta tcgccacata catcaccaaa tggtacgtga tagtaccact cgctttgatt 1140
gcttataggg tcctcgacta cttttacggc agatacttga tgtacaagct tggtgctaaa 1200
ccgtttttcc agaaacaaac agacggttat ttcggattca aagctccact tgaattgtta 1260
aaaaagaaga gtgacggtac cctcatagac ttcactctcg agcgtatcca agcgctcaat 1320
cgtccagata tcccaacttt tacattccca atcttttcca tcaaccttat cagcaccctt 1380
gagccggaga acatcaaggc tatcttggcc acccagttca acgatttctc cttgggcacc 1440
agacactcgc actttgctcc tttgttgggc gatggtatct ttaccttgga cggtgccggc 1500
tggaagcaca gcagatctat gttgagacca cagtttgcca gagaacagat ttcccacgtc 1560
aagttgttgg agccacacat gcaggtgttc ttcaagcacg tcagaaaggc acagggcaag 1620
acttttgaca tccaagaatt gtttttcaga ttgaccgtcg actccgccac tgagtttttg 1680
tttggtgaat ccgttgagtc cttgagagat gaatctattg ggatgtccat caatgcactt 1740
gactttgacg gcaaggctgg ctttgctgat gcttttaact actcgcagaa ctatttggct 1800
tcgagagcgg ttatgcaaca attgtactgg gtgttgaacg ggaaaaagtt taaggagtgc 1860
aacgctaaag tgcacaagtt tgctgactat tacgtcagca aggctttgga cttgacacct 1920
gaacaattgg aaaagcagga tggttatgtg ttcttgtacg agttggtcaa gcaaaccaga 1980
gacaggcaag tgttgagaga ccagttgttg aacatcatgg ttgccggtag agacaccacc 2040
gccggtttgt tgtcgtttgt tttctttgaa ttggccagaa acccagaggt gaccaacaag 2100
ttgagagaag aaatcgagga caagtttggt cttggtgaga atgctcgtgt tgaagacatt 2160
tcctttgagt cgttgaagtc atgtgaatac ttgaaggctg ttctcaacga aactttgaga 2220
ttgtacccat ccgtgccaca gaatttcaga gttgccacca aaaacactac ccttccaagg 2280
ggaggtggta aggacgggtt atctcctgtt ttggtcagaa agggtcaaac cgttatgtac 2340
ggtgtctacg ctgcccacag aaacccagct gtctacggta aggacgccct tgagtttaga 2400
ccagagaggt ggtttgagcc agagacaaag aagcttggct gggccttcct tccattcaac 2460
ggtggtccaa gaatttgctt gggacagcag tttgccttga cagaagcttc gtatgtcact 2520
gtcagattgc tccaagagtt tggacacttg tctatggacc ccaacaccga atatccacct 2580
aggaaaatgt cgcatttgac catgtccctt ttcgacggtg ccaacattga gatgtattag 2640
aggatcatgt gttatttttg attggtttag tctgtttgta gctattgatt aggttaattc 2700
acggattgtt atttattgat agggggtgcg tgtgtgtgtg tgtgttgcat tcacatggga 2760
tcgttccagg ttgttgtttc cttccatcct gttgagtcaa aaggagtttt gttttgtaac 2820
tccggacgat gtcttagata gaaggtcgat ctccatgtga ttgtttgact gctactctga 2880
ttatgtaatc tgtaaagcct agacgttatg caagcatgtg attgtggttt ttgcaacctg 2940
tttgcacgac aaatgatcga cagtcgatta cgtaatccat attatttaga ggggtaataa 3000
aaaataaatg gcagccagaa tttcaaacat tttgcaaaca atgcaaaaga tgagaaactc 3060
caacagaaaa aataaaaaaa ctccgcagca ctccgaacca acaaaacaat ggggggcgcc 3120
agaattattg actattgtga ctttttttta ttttttccgt taactttcat tgcagtgaag 3180
tgtgttacac ggggtggtga tggtgttggt ttctacaatg caagggcaca gttgaaggtt 3240
tccacataac gttgcaccat atcaactcaa tttatcctca ttcatgtgat aaaagaagag 3300
ccaaaaggta attggcagac cccccaaggg gaacacggag tagaaagcaa tggaaacacg 3360
cccatgacag tgccatttag cccacaacac atctagtatt cttttttttt tttgtgcgca 3420
ggtgcacacc tggactttag ttattgcccc ataaagttaa caatctcacc tttggctctc 3480
ccagtgtctc cgcctccaga tgctcgtttt acaccctcga gctaacgaca acacaacacc 3540
catgagggga atgggcaaag ttaaacactt ttggtttcaa tgattcctat ttgctactct 3600
cttgttttgt gttttgattt gcaccatgtg aaataaacga caattatata taccttttcg 3660
tctgtcctcc aatgtctctt tttgctgcca ttttgctttt tgctttttgc ttttgcactc 3720
tctcccactc ccacaatcag tgcagcaaca cacaa 3755



<210> 88 
<211> 3900 
<212> DNA 

<213> Candida tropicalis 
<400> 88 

gacatcataa tgacccggtt atttcgccct caggttgctt atttgagccg taaagtgcag 60 
tagaaacttt gccttgggtt caaactctag tataatggtg ataactggtt gcactcttgc 120 
cataggcatg aaaataggcc gttatagtac tatatttaat aagcgtagga gtataggatg 180 
catatgaccg gtttttctat atttttaaga taatctctag taaattttgt attctcagta 240 
ggatttcatc aaatttcgca accaattctg gcgaaaaaat gattctttta cgtcaaaagc 300 
tgaatagtgc agtttaaagc acctaaaatc acatatacag cctctagata cgacagagaa 360
cctatatctc tttttgctac uracgutrcTi 1080
tgct tttgcs ctctcccact cccacaaaga aaaaaaaact acactatgtc gtcttctcca 1140
tcgtttgccc aagaggttct cgctaccact agtccttaca tcgagtactt tcttgacaac 1200
tacaccagat ggtactactt catacctttg gtgcttcttt cgttgaactt tataagtttg 1260
ctccacacaa ggtacttgga acgcaggttc cacgccaagc cactcggtaa ctttgtcagg 1320
gaccctacgt ttggtatcgc tactccgttg cttttgatct acttgaagtc gaaaggtacg 1380
yi-ca LyaayT- ttgcttgggg cctctggaac aacaagtaca tcgtcagaga cccaaagtac 1440
aay acaac cy ggctcaggat tgttggcctc ccattgattg aaaccatgga cccagagaac 1500
atcaaggctg ttttggctac tcagttcaat gatttctctt tgggaaccag acacgatttc 1560
ttgtactcct tgttgggtga cggtattttc accttggacg gtgctggctg gaaacatagt 1620
agaactatgt tgagaccaca gtttgctaga gaacaggttt ctcacgtcaa gttgttggag 1680
ccacacgttc aggtgttctt caagcacgtt agaaagcacc gcggtcaaac gttcgacatc 1740
caagaattgt tcttcaggtt gaccgtcgac tccgccaccg agttcttgtt tggtgagtct 1800
gctgaatcct tgagggacga atctattgga ttgaccccaa ccaccaagga tttcgatggc 1860
agaagagatt tcgctgacgc tttcaactat tcgcagactt accaggccta cagatttttg 1920
ttgcaacaaa tgtactggat cttgaatggc tcggaattca gaaagtcgat tgctgtcgtg 1980
cacaagtttg ctgaccacta tgtgcaaaag gctttggagt tgaccgacga tgacttgcag 2040
aaacaagacg gctatgtgtt cttgtacgag ttggctaagc aaaccagaga cccaaaggtc 2100
ttgagagacc agttattgaa cattttggtt gccggtagag acacgaccgc cggtttgttg 2160
tcat;t.t9*^tt: tctacgagtt gtcaagaaac cctgaggtgt ttgctaagtt gagagaggag 2220
gtggaaaaca gatttggact cggtgaagaa gctcgtgttg aagagatctc gtttgagtcc 2280
ttgaagtctt gtgagtactt gaaggctgtc atcaatgaaa ccttgagatt gtacccatcg 2340
gttccacaca actttagagt tgctaccaga aacactaccc tcccaagagg tggtggtgaa 2400
gatggatact cgccaattgt cgtcaagaag ggtcaagttg tcatgtacac tgttattgct 2460
acccacagag acccaagtat ctacggtgcc gacgctgacg tcttcagacc agaaagatgg 2520
tttgaaccag aaactagaaa gttgggctgg gcatacgttc cattcaatgg tggtccaaga 2580
atctgtttgg gtcaacagtt tgccttgacc gaagcttcat acgtcactgt cagattgctc 2640
caggagtttg cacacttgtc tatggaccca gacaccgaat atccaccaaa attgcagaac 2700
accttgacct X, g c cgc u c t, u tgatggtgct gatgttagaa tgtactaagg ttgcttttcc 2760
ttgctaattt agcx. Lgugna tttaaattga atcggcaatt gatttttctg 2820
ataccaataa CL.gi,d.g ugcg atttgaccaa aaccgttcaa actttttgtt ctctcgttga 2880
cgtgctcgct catcagcact gtttgaagac gaaagagaaa attttttgta aacaacactg 2940
tccaaattta cccaacgtga accattatgc aaatgagcgg ccctttcaac tggtcgctgg 3000
aagcattcgg ggatatctac aacgccctta agtttgaaac agacattgat ttagacacca 3060
tagatttcag cggcatcaag aatgaccttg cccacatttt gacgacccca acaccactgg 3120
aagaatcacg ccagaaacta ggcgatggat ccaagcctgt gaccttgccc aatggagacg 3180
aagtggagtt gaaccaagcg ttcctagaag ttaccacatt attgtcgaat gagtttgact 3240
tggaccaatt gaacgcggca gagttgttat actacgctgg cgacatatcc tacaagaagg 3300
gcacatcaat cgcagacagt gccagattgt cttattattt gagagcaaac tacatcttga 3360
acatacttgg gtatttgatt tcgaagcagc gattggattt gatagtcacg gacaacgacg 3420
cgttgtttga tagtattttg aaaagttttg aaaagatcta caagttgata agcgtgttga 3480
acgatatgat tgacaagcaa aaggtgacaa gcgacatcaa cagtctagca ttcatcaatt 3540
gcatcaacta ctcgagaggt caactattct ccgcacacga acttttggga ctggttttgt 3600
ttggattggt cgacatctat ttcaaccagt ttggcacatt agacaactac aagaaggtat 3660
tggcattgat actgaagaac atcagcgatg aagacatctt gatcatacac ttcctcccat 3720
cgacactaca attgtttaag ctggtgttgg acaagaaaga cgacgctgca gttgaacagt 3780
tctacaagta catcacttca acagtgtcac gagactacaa ctccaacatc ggctccacag 3840
ccaaagatga tatcgatttg tccaaaacca aactcagtgg ctttgaggtg ttgacgagtt 3900



<210> 89 
<211> 3668
<212> DNA

<213> Candida tropicalis
<400> 89

cctgcagaat tcgcggccgc gtcgacagag tagcagttat gcaagcatgt gattgtggtt 60
tttgcaacct gtttgcacga caaatgatcg acagtcgatt acgtaatcca tattatttag 120
aggggtaata aaaaataaat ggcagccaga atttcaaaca ttttgcaaac aatgcaaaag 180
atgagaaact ccaacagaaa aaataaaaaa actccgcagc actccgaacc aacaaaacaa 240
tggggggcgc cagaattatt gactattgtg actttttttt attttttccg ttaactttca 300
ttgcagtgaa gtgtgttaca cggggtggtg atggtgttgg tttctacaat gcaagggcac 360
agttgaaggt ttccacataa cgttgcacca tatcaactca atttatcctc attcatgtga 420
taaaagaaga gccaaaaggt aattggcaga ccccccaagg ggaacacgga gtagaaagca 480
atggaaacac gcccatgaca gtgccattta gcccacaaca catctagtat tctttttttt 540
ttttgtgcgc aggtgcacac ctggacttta gttattgccc cataaagtta acaatctcac 600
ctttggctct cccagtgtct ccgcctccag atgctcgttt tacaccctcg agctaacgac 660
aacacaacac ccatgagggg aatgggcaaa gttaaacact tttggtttca atgattccta 720
tttgctactc tcttgttttg tgttttgatt tgcaccatgt gaaataaacg acaattatat 780
ataccttttc gtctgtcctc caatgtctct ttttgctgcc attttgcttt ttgctttttg 840
cttttgcact ctctcccact cccacaatca gtgcagcaac acacaaagaa gaaaaataaa 900
aaaacctaca ctatgtcgtc ttctccatcg tttgctcagg aggttctcgc taccactagt 960
ccttacatcg agtactttct tgacaactac accagatggt actacttcat ccctttggtg 1020
cttctttcgt tgaacttcat cagcttgctc cacacaaagt acttggaacg caggttccac 1080
gccaagccgc tcggtaacgt cgtgttggat cctacgtttg gtatcgctac tccgttgatc 1140
ttgatctact taaagtcgaa aggtacagtc atgaagtttg cctggagctt ctggaacaac 1200
aagtacattg tcaaagaccc aaagtacaag accactggcc ttagaattgt cggcctccca 1260
ttgattgaaa ccatagaccc agagaacatc aaagctgtgt tggctactca gttcaacgat 1320
ttctccttgg gaactagaca cgatttcttg tactccttgt tgggcgatgg tatttttacc 1380
ttggacggtg ctggctggaa acacagtaga actatgttga gaccacagtt tgctagagaa 1440
caggtttccc acgtcaagtt gttggaacca cacgttcagg tgttcttcaa gcacgttaga 1500
aaacaccgcg gtcagacttt tgacatccaa gaattgttct tcagattgac cgtcgactcc 1560
gccaccgagt tcttgtttgg tgagtctgct gaatccttga gagacgactc tgttggtttg 1620
accccaacca ccaaggattt cgaaggcaga ggagatttcg ctgacgcttt caactactcg 1680
cagacttacc aggcctacag atttttgttg caacaaatgt actggatttt gaatggcgcg 1740
gaattcagaa agtcgattgc catcgtgcac aagtttgctg accactatgt gcaaaaggct 1800
ttggagttga ccgacgatga cttgcagaaa caagacggct atgtgttctt gtacgagttg 1860
gctaagcaaa ctagagaccc aaaggtcttg agagaccagt tgttgaacat tttggttgcc 1920
ggtagagaca cgaccgccgg tttgttgtcg tttgtgttct acgagttgtc gagaaaccct 1980
gaagtgtttg ccaagttgag agaggaggtg gaaaacagat ttggactcgg cgaagaggct 2040
cgtgttgaag agatctcttt tgagtccttg aagtcctgtg agtacttgaa ggctgtcatc 2100
aatgaagcct tgagattgta cccatctgtt ccacacaact tcagagttgc caccagaaac 2160
actacccttc caagaggcgg tggtaaagac ggatgctcgc caattgttgt caagaagggt 2220
caagttgtca tgtacactgt cattggtacc cacagagacc caagtatcta cggtgccgac 2280
gccgacgtct tcagaccaga aagatggttc gagccagaaa ctagaaagtt gggctgggca 2340
tatgttccat tcaatggtgg tccaagaatc tgtttgggtc agcagtttgc cttgactgaa 2400
gcttcatacg tcactgtcag attgctccaa gagtttggaa acttgtccct ggatccaaac 2460
gctgagtacc caccaaaatt gcagaacacc ttgaccttgt cactctttga tggtgctgac 2520
gttagaatgt tctaaggttg cttatccttg ctagtgttat ttatagtttg tgtatttaaa 2580
ttgaatcggc gattgatttt tctggtacta ataactgtag tgggttttga ccaaaaccgt 2640
tcaaactttt tttttttttt tcttccccct accttcgttg ctcgctcatc agcactgttt 2700
gaaaacgaaa aaagaaaatt ttttgtaaac aacattgccc aaacttaccc aacgtgaacc 2760
attataacca aatgagcggc gctttcaact ggtcactgga ggcattcggg gatatctaca 2820
acacccttaa gtttgaggaa gacattgatt tagacaccat agatttcagc ggcatcaaga 2880
atgaccttgt ccacattttg acaaccccaa caccactgga agaatcgcgc cagaaactag 2940
gcgatggatc caagcctgtg
atgcacgaca aggaaactct tacaaagaca acacttgtgc tctgatgcca cttgatcttg
ctaagcctta
ctgccttggt cagcagtttg caatccttga agcttcgtat gttttggctc gattgacaca 2160

gtgctacacg acgatacagc ttagaactac cgagtaccca ccaaagaaac tcgttcatct 2220 

cacgatgagt cttctcaacg gggtgtacat ccgaactaga acttgattat gtgtttatgg 2280 

ttaatcgggg caaagcactg caagtcattg atgtttgtgg aagcccagca ttggtgttcc 2340 

ggagcatcaa taaccaatgt cttgaagggt ttgattttct tgaccttctt cttcctgagc 2400 

ttctttccgt caaacttgta cagaatggcc atcatttcag gaacaaccac gtacgacggc 2460

cggtaccgca tctggagtat ctcgccgtcg ttcaagtagc acgaaaacag caacgacgtc 2520 

accatctgct tcccaatctt gacacccaca gatacccctg cggcttcatg gatcaaaaac 2580 

gtcggcaacc ccgcgtatat gtccatgtaa ttctccatgg ccacctccat caacacactg 2640 

atggagcgac tgacggtgcc accactgccc tcggttgagt caaggcagta tgatgccggg 2700 

atccagtact ccaatgggaa cctctgcacg gtgtcgctgc agtttttgag gcgtatttcg 2760 

atccatgatc gttctttggt gctgtagtat aacgagctct tggtgtcctt gaaatggaac 2820 

aggttggatg tgttgttgag tttgtctgcg tgcttggttt gcaagtcttc gatcgagcgt 2880 

agtgagtaga cagttggcgg gggtggtggc tcgggcttta ttctgtgttt gtgtttcctt 2940 

cttagtcttg gaatgacgct gttatcgacg gttcgtagta taagtagcgc caatatgaga 3000 

atgtatatcc gcatcaccca agactcttca gcctgttaca acgactgagg ctgttggccg 3060 

tgtgaccaat t^gtttcttt ggtgacctag attggtcccg cagggaaagc aagggctgct 3120 

aggggggcat accaaacaag gtcgtgtaat cagtatctat ggtgctacca tgtgtgtggt 3180

tggggggaaa ttcccgcatt tttgtgtaac gaaagttcta gaaagttctc gtgggttctg 3240 

agaatctgct ggaaccatcc acccgcattt ccgttgccaa agtgggaaga gcaatcaacc 3300 

caccctgctt tgcccaatca gccattcccc tgggaatata aattcaac 3348 

<210> 95 
<211> 523 
<212> PRT 

<213> Candida tropicalis
<400> 95

Met Ala Thr Gln Glu Ile Ile Asp Ser Val Leu Pro Tyr Leu Thr Lys
1 5 10 15

Trp Tyr Thr Val Ile Thr Ala Ala Val Leu Val Phe Leu Ile Ser Thr
20 25 30

Asn Ile Lys Asn Tyr Val Lys Ala Lys Lys Leu Lys Cys Val Asp Pro
35 40 45

Pro Tyr Leu Lys Asp Ala Gly Leu Thr Gly Ile Leu Ser Leu Ile Ala
50 55 60

Ala Ile Lys Ala Lys Asn Asp Gly Arg Leu Ala Asn Phe Ala Asp Glu
65 70 75 80

Val Phe Asp Glu Tyr Pro Asn His Thr Phe Tyr Leu Ser Val Ala Gly
85 90 95

Ala Leu Lys Ile Val Met Thr Val Asp Pro Glu Asn Ile Lys Ala Val
100 105 110

Leu Ala Thr Gln Phe Thr Asp Phe Ser Leu Gly Thr Arg His Ala His
115 120 125

Phe Ala Pro Leu Leu Gly Asp Gly Ile Phe Thr Leu Asp Gly Glu Gly
130 135 140

Trp Lys His Ser Arg Ala Met Leu Arg Pro Gln Phe Ala Arg Asp Gln
145 150 155 160

Ile Gly His Val Lys Ala Leu Glu Pro His Ile Gln Ile Met Ala Lys
165 170 175

Gln Ile Lys Leu Asn Gln Gly Lys Thr Phe Asp Ile Gln Glu Leu Phe
180 185 190

Phe Arg Phe Thr Val Asp Thr Ala Thr Glu Phe Leu Phe Gly Glu Ser
195 200 205

Val His Ser Leu Tyr Asp Glu Lys Leu Gly Ile Pro Thr Pro Asn Glu
210 215 220

Ile Pro Gly Arg Glu Asn Phe Ala Ala Ala Phe Asn Val Ser Gln His
225 230 235 240

Tyr Leu Ala Thr Arg Ser Tyr Ser Gln Thr Phe Tyr Phe Leu Thr Asn
245 250 255

Pro Lys Glu Phe Arg Asp Cys Asn Ala Lys Val His His Leu Ala Lys
260 265 270

Tyr Phe Val Asn Lys Ala Leu Asn Phe Thr Pro Glu Glu Leu Glu Glu
275 280 285

Lys Ser Lys Ser Gly Tyr Val Phe Leu Tyr Glu Leu Val Lys Gln Thr
290 295 300

Arg Asp Pro Lys Val Leu Gln Asp Gln Leu Leu Asn Ile Met Val Ala
305 310 315 320

Gly Arg Asp Thr Thr Ala Gly Leu Leu Ser Phe Ala Leu Phe Glu Leu
325 330 335

Ala Arg His Pro Glu Met Trp Ser Lys Leu Arg Glu Glu Ile Glu Val
340 345 350

Asn Phe Gly Val Gly Glu Asp Ser Arg Val Glu Glu Ile Thr Phe Glu
355 360 365

Ala Leu Lys Arg Cys Glu Tyr Leu Lys Ala Ile Leu Asn Glu Thr Leu
370 375 380

Arg Met Tyr Pro Ser Val Pro Val Asn Phe Arg Thr Ala Thr Arg Asp
385 390 395 400

Thr Thr Leu Pro Arg Gly Gly Gly Ala Asn Gly Thr Asp Pro Ile Tyr
405 410 415

Ile Pro Lys Gly Ser Thr Val Ala Tyr Val Val Tyr Lys Thr His Arg
420 425 430

Leu Glu Glu Tyr Tyr Gly Lys Asp Ala Asn Asp Phe Arg Pro Glu Arg
435 440 445

Trp Phe Glu Pro Ser Thr Lys Lys Leu Gly Trp Ala Tyr Val Pro Phe
450 455 460

Asn Gly Gly Pro Arg Val Cys Leu Gly Gln Gln Phe Ala Leu Thr Glu
465 470 475 480

Ala Ser Tyr Val Ile Thr Arg Leu Ala Gln Met Phe Glu Thr Val Ser
485 490 495

Ser Asp Pro Gly Leu Glu Tyr Pro Pro Pro Lys Cys Ile His Leu Thr
500 505 510

Met Ser His Asn Asp Gly Val Phe Val Lys Met
515 520
<210> 96
<211> 522
<212> PRT

<213> Candida tropicalis
<400> 96

Met Thr Val His Asp Ile Ile Ala Thr Tyr Phe Thr Lys Trp Tyr Val
1 5 10 15

Ile Val Pro Leu Ala Leu Ile Ala Tyr Arg Val Leu Asp Tyr Phe Tyr
20 25 30

Gly Arg Tyr Leu Met Tyr Lys Leu Gly Ala Lys Pro Phe Phe Gln Lys
35 40 45

Gln Thr Asp Gly Cys Phe Gly Phe Lys Ala Pro Leu Glu Leu Leu Lys
50 55 60

Lys Lys Ser Asp Gly Thr Leu Ile Asp Phe Thr Leu Gln Arg Ile His
65 70 75 80

Asp Leu Asp Arg Pro Asp Ile Pro Thr Phe Thr Phe Pro Val Phe Ser
85 90 95

Ile Asn Leu Val Asn Thr Leu Glu Pro Glu Asn Ile Lys Ala Ile Leu
100 105 110

Ala Thr Gln Phe Asn Asp Phe Ser Leu Gly Thr Arg His Ser His Phe
115 120 125

Ala Pro Leu Leu Gly Asp Gly Ile Phe Thr Leu Asp Gly Ala Gly Trp
130 135 140

Lys His Ser Arg Ser Met Leu Arg Pro Gln Phe Ala Arg Glu Gln Ile
145 150 155 160

Ser His Val Lys Leu Leu Glu Pro His Val Gln Val Phe Phe Lys His
165 170 175

Val Arg Lys Ala Gln Gly Lys Thr Phe Asp Ile Gln Glu Leu Phe Phe
180 185 190

Arg Leu Thr Val Asp Ser Ala Thr Glu Phe Leu Phe Gly Glu Ser Val
195 200 205

Glu Ser Leu Arg Asp Glu Ser Ile Gly Met Ser Ile Asn Ala Leu Asp
210 215 220

Phe Asp Gly Lys Ala Gly Phe Ala Asp Ala Phe Asn Tyr Ser Gln Asn
225 230 235 240

Tyr Leu Ala Ser Arg Ala Val Met Gln Gln Leu Tyr Trp Val Leu Asn
245 250 255

Gly Lys Lys Phe Lys Glu Cys Asn Ala Lys Val His Lys Phe Ala Asp
260 265 270

Tyr Tyr Val Asn Lys Ala Leu Asp Leu Thr Pro Glu Gln Leu Glu Lys
275 280 285

Gln Asp Gly Tyr Val Phe Leu Tyr Glu Leu Val Lys Gln Thr Arg Asp
290 295 300

Lys Gln Val Leu Arg Asp Gln Leu Leu Asn Ile Met Val Ala Gly Arg
305 310 315 320

Asp Thr Thr Ala Gly Leu Leu Ser Phe Val Phe Phe Glu Leu Ala Arg
325 330 335

Asn Pro Glu Val Thr Asn Lys Leu Arg Glu Glu Ile Glu Asp Lys Phe
340 345 350

Gly Leu Gly Glu Asn Ala Ser Val Glu Asp Ile Ser Phe Glu Ser Leu
355 360 365

Lys Ser Cys Glu Tyr Leu Lys Ala Val Leu Asn Glu Thr Leu Arg Leu
370 375 380

Tyr Pro Ser Val Pro Gln Asn Phe Arg Val Ala Thr Lys Asn Thr Thr
385 390 395 400

Leu Pro Arg Gly Gly Gly Lys Asp Gly Leu Ser Pro Val Leu Val Arg
405 410 415

Lys Gly Gln Thr Val Ile Tyr Gly Val Tyr Ala Ala His Arg Asn Pro
420 425 430

Ala Val Tyr Gly Lys Asp Ala Leu Glu Phe Arg Pro Glu Arg Trp Phe
435 440 445

Glu Pro Glu Thr Lys Lys Leu Gly Trp Ala Phe Leu Pro Phe Asn Gly
450 455 460

Gly Pro Arg Ile Cys Leu Gly Gln Gln Phe Ala Leu Thr Glu Ala Ser
465 470 475 480

Tyr Val Thr Val Arg Leu Leu Gln Glu Phe Ala His Leu Ser Met Asp
485 490 495

Pro Asp Thr Glu Tyr Pro Pro Lys Lys Met Ser His Leu Thr Met Ser
500 505 510

Leu Phe Asp Gly Ala Asn Ile Glu Met Tyr
515 520

<210> 97 
<211> 522 
<212> PRT 

<213> Candida tropicalis
<400> 97

Met Thr Ala Gln Asp Ile Ile Ala Thr Tyr Ile Thr Lys Trp Tyr Val
1 5 10 15

Ile Val Pro Leu Ala Leu Ile Ala Tyr Arg Val Leu Asp Tyr Phe Tyr
20 25 30

Gly Arg Tyr Leu Met Tyr Lys Leu Gly Ala Lys Pro Phe Phe Gln Lys
35 40 45

Gln Thr Asp Gly Tyr Phe Gly Phe Lys Ala Pro Leu Glu Leu Leu Lys
50 55 60

Lys Lys Ser Asp Gly Thr Leu Ile Asp Phe Thr Leu Glu Arg Ile Gln
65 70 75 80

Ala Leu Asn Arg Pro Asp Ile Pro Thr Phe Thr Phe Pro Ile Phe Ser
85 90 95

Ile Asn Leu Ile Ser Thr Leu Glu Pro Glu Asn Ile Lys Ala Ile Leu
100 105 110

Ala Thr Gln Phe Asn Asp Phe Ser Leu Gly Thr Arg His Ser His Phe
115 120 125

Ala Pro Leu Leu Gly Asp Gly Ile Phe Thr Leu Asp Gly Ala Gly Trp
130 135 140

Lys His Ser Arg Ser Met Leu Arg Pro Gln Phe Ala Arg Glu Gln Ile
145 150 155 160

Ser His Val Lys Leu Leu Glu Pro His Met Gln Val Phe Phe Lys His
165 170 175

Val Arg Lys Ala Gln Gly Lys Thr Phe Asp Ile Gln Glu Leu Phe Phe
180 185 190

Arg Leu Thr Val Asp Ser Ala Thr Glu Phe Leu Phe Gly Glu Ser Val
195 200 205

Glu Ser Leu Arg Asp Glu Ser Ile Gly Met Ser Ile Asn Ala Leu Asp
210 215 220

Phe Asp Gly Lys Ala Gly Phe Ala Asp Ala Phe Asn Tyr Ser Gln Asn
225 230 235 240

Tyr Leu Ala Ser Arg Ala Val Met Gln Gln Leu Tyr Trp Val Leu Asn
245 250 255

Gly Lys Lys Phe Lys Glu Cys Asn Ala Lys Val His Lys Phe Ala Asp
260 265 270

Tyr Tyr Val Ser Lys Ala Leu Asp Leu Thr Pro Glu Gln Leu Glu Lys
275 280 285

Gln Asp Gly Tyr Val Phe Leu Tyr Glu Leu Val Lys Gln Thr Arg Asp
290 295 300

Arg Gln Val Leu Arg Asp Gln Leu Leu Asn Ile Met Val Ala Gly Arg
305 310 315 320

Asp Thr Thr Ala Gly Leu Leu Ser Phe Val Phe Phe Glu Leu Ala Arg
325 330 335

Asn Pro Glu Val Thr Asn Lys Leu Arg Glu Glu Ile Glu Asp Lys Phe
340 345 350

Gly Leu Gly Glu Asn Ala Arg Val Glu Asp Ile Ser Phe Glu Ser Leu
355 360 365

Lys Ser Cys Glu Tyr Leu Lys Ala Val Leu Asn Glu Thr Leu Arg Leu
370 375 380

Tyr Pro Ser Val Pro Gln Asn Phe Arg Val Ala Thr Lys Asn Thr Thr
385 390 395 400

Leu Pro Arg Gly Gly Gly Lys Asp Gly Leu Ser Pro Val Leu Val Arg
405 410 415

Lys Gly Gln Thr Val Met Tyr Gly Val Tyr Ala Ala His Arg Asn Pro
420 425 430

Ala Val Tyr Gly Lys Asp Ala Leu Glu Phe Arg Pro Glu Arg Trp Phe
435 440 445

Glu Pro Glu Thr Lys Lys Leu Gly Trp Ala Phe Leu Pro Phe Asn Gly
450 455 460

Gly Pro Arg Ile Cys Leu Gly Gln Gln Phe Ala Leu Thr Glu Ala Ser
465 470 475 480

Tyr Val Thr Val Arg Leu Leu Gln Glu Phe Gly His Leu Ser Met Asp
485 490 495

Pro Asn Thr Glu Tyr Pro Pro Arg Lys Met Ser His Leu Thr Met Ser
500 505 510

Leu Phe Asp Gly Ala Asn Ile Glu Met Tyr
515 520

<210> 98 
<211> 540 
<212> PRT 

<213> Candida tropicalis
<400> 98

Met Ser Ser Ser Pro Ser Phe Ala Gln Glu Val Leu Ala Thr Thr Ser
1 5 10 15

Pro Tyr Ile Glu Tyr Phe Leu Asp Asn Tyr Thr Arg Trp Tyr Tyr Phe
20 25 30

Ile Pro Leu Val Leu Leu Ser Leu Asn Phe Ile Ser Leu Leu His Thr
35 40 45

Arg Tyr Leu Glu Arg Arg Phe His Ala Lys Pro Leu Gly Asn Phe Val
50 55 60

Arg Asp Pro Thr Phe Gly Ile Ala Thr Pro Leu Leu Leu Ile Tyr Leu
65 70 75 80

Lys Ser Lys Gly Thr Val Met Lys Phe Ala Trp Gly Leu Trp Asn Asn
85 90 95

Lys Tyr Ile Val Arg Asp Pro Lys Tyr Lys Thr Thr Gly Leu Arg Ile
100 105 110

Val Gly Leu Pro Leu Ile Glu Thr Met Asp Pro Glu Asn Ile Lys Ala
115 120 125

Val Leu Ala Thr Gln Phe Asn Asp Phe Ser Leu Gly Thr Arg His Asp
130 135 140

Phe Leu Tyr Ser Leu Leu Gly Asp Gly Ile Phe Thr Leu Asp Gly Ala
145 150 155 160

Gly Trp Lys His Ser Arg Thr Met Leu Arg Pro Gln Phe Ala Arg Glu
165 170 175

Gln Val Ser His Val Lys Leu Leu Glu Pro His Val Gln Val Phe Phe
180 185 190

Lys His Val Arg Lys His Arg Gly Gln Thr Phe Asp Ile Gln Glu Leu
195 200 205

Phe Phe Arg Leu Thr Val Asp Ser Ala Thr Glu Phe Leu Phe Gly Glu
210 215 220

Ser Ala Glu Ser Leu Arg Asp Glu Ser Ile Gly Leu Thr Pro Thr Thr
225 230 235 240

Lys Asp Phe Asp Gly Arg Arg Asp Phe Ala Asp Ala Phe Asn Tyr Ser
245 250 255

Gln Thr Tyr Gln Ala Tyr Arg Phe Leu Leu Gln Gln Met Tyr Trp Ile
260 265 270

Leu Asn Gly Ser Glu Phe Arg Lys Ser Ile Ala Val Val His Lys Phe
275 280 285

Ala Asp His Tyr Val Gln Lys Ala Leu Glu Leu Thr Asp Asp Asp Leu
290 295 300

Gln Lys Gln Asp Gly Tyr Val Phe Leu Tyr Glu Leu Ala Lys Gln Thr
305 310 315 320

Arg Asp Pro Lys Val Leu Arg Asp Gln Leu Leu Asn Ile Leu Val Ala
325 330 335

Gly Arg Asp Thr Thr Ala Gly Leu Leu Ser Phe Val Phe Tyr Glu Leu
340 345 350

Ser Arg Asn Pro Glu Val Phe Ala Lys Leu Arg Glu Glu Val Glu Asn
355 360 365

Arg Phe Gly Leu Gly Glu Glu Ala Arg Val Glu Glu Ile Ser Phe Glu
370 375 380

Ser Leu Lys Ser Cys Glu Tyr Leu Lys Ala Val Ile Asn Glu Thr Leu
385 390 395 400

Arg Leu Tyr Pro Ser Val Pro His Asn Phe Arg Val Ala Thr Arg Asn
405 410 415

Thr Thr Leu Pro Arg Gly Gly Gly Glu Asp Gly Tyr Ser Pro Ile Val
420 425 430

Val Lys Lys Gly Gln Val Val Met Tyr Thr Val Ile Ala Thr His Arg
435 440 445

Asp Pro Ser Ile Tyr Gly Ala Asp Ala Asp Val Phe Arg Pro Glu Arg
450 455 460

Trp Phe Glu Pro Glu Thr Arg Lys Leu Gly Trp Ala Tyr Val Pro Phe
465 470 475 480

Asn Gly Gly Pro Arg Ile Cys Leu Gly Gln Gln Phe Ala Leu Thr Glu
485 490 495

Ala Ser Tyr Val Thr Val Arg Leu Leu Gln Glu Phe Ala His Leu Ser
500 505 510

Met Asp Pro Asp Thr Glu Tyr Pro Pro Lys Leu Gln Asn Thr Leu Thr
515 520 525

Leu Ser Leu Phe Asp Gly Ala Asp Val Arg Met Tyr
530 535 540



<210> 99 
<211> 540
<212> PRT

<213> Candida tropicalis
<400> 99

Met Ser Ser Ser Pro Ser Phe Ala Gln Glu Val Leu Ala Thr Thr Ser
1 5 10 15

Pro Tyr Ile Glu Tyr Phe Leu Asp Asn Tyr Thr Arg Trp Tyr Tyr Phe
20 25 30

Ile Pro Leu Val Leu Leu Ser Leu Asn Phe Ile Ser Leu Leu His Thr
35 40 45

Lys Tyr Leu Glu Arg Arg Phe His Ala Lys Pro Leu Gly Asn Val Val
50 55 60

Leu Asp Pro Thr Phe Gly Ile Ala Thr Pro Leu Ile Leu Ile Tyr Leu
65 70 75 80

Lys Ser Lys Gly Thr Val Met Lys Phe Ala Trp Ser Phe Trp Asn Asn
85 90 95

Lys Tyr Ile Val Lys Asp Pro Lys Tyr Lys Thr Thr Gly Leu Arg Ile
100 105 110

Val Gly Leu Pro Leu Ile Glu Thr Ile Asp Pro Glu Asn Ile Lys Ala
115 120 125

Val Leu Ala Thr Gln Phe Asn Asp Phe Ser Leu Gly Thr Arg His Asp
130 135 140

Phe Leu Tyr Ser Leu Leu Gly Asp Gly Ile Phe Thr Leu Asp Gly Ala
145 150 155 160

Gly Trp Lys His Ser Arg Thr Met Leu Arg Pro Gln Phe Ala Arg Glu
165 170 175

Gln Val Ser His Val Lys Leu Leu Glu Pro His Val Gln Val Phe Phe
180 185 190

Lys His Val Arg Lys His Arg Gly Gln Thr Phe Asp Ile Gln Glu Leu
195 200 205

Phe Phe Arg Leu Thr Val Asp Ser Ala Thr Glu Phe Leu Phe Gly Glu
210 215 220

Ser Ala Glu Ser Leu Arg Asp Asp Ser Val Gly Leu Thr Pro Thr Thr
225 230 235 240

Lys Asp Phe Glu Gly Arg Gly Asp Phe Ala Asp Ala Phe Asn Tyr Ser
245 250 255

Gln Thr Tyr Gln Ala Tyr Arg Phe Leu Leu Gln Gln Met Tyr Trp Ile
260 265 270

Leu Asn Gly Ala Glu Phe Arg Lys Ser Ile Ala Ile Val His Lys Phe
275 280 285

Ala Asp His Tyr Val Gln Lys Ala Leu Glu Leu Thr Asp Asp Asp Leu
290 295 300

Gln Lys Gln Asp Gly Tyr Val Phe Leu Tyr Glu Leu Ala Lys Gln Thr
305 310 315 320

Arg Asp Pro Lys Val Leu Arg Asp Gln Leu Leu Asn Ile Leu Val Ala
325 330 335

Gly Arg Asp Thr Thr Ala Gly Leu Leu Ser Phe Val Phe Tyr Glu Leu
340 345 350

Ser Arg Asn Pro Glu Val Phe Ala Lys Leu Arg Glu Glu Val Glu Asn
355 360 365

Arg Phe Gly Leu Gly Glu Glu Ala Arg Val Glu Glu Ile Ser Phe Glu
370 375 380

Ser Leu Lys Ser Cys Glu Tyr Leu Lys Ala Val Ile Asn Glu Ala Leu
385 390 395 400

Arg Leu Tyr Pro Ser Val Pro His Asn Phe Arg Val Ala Thr Arg Asn
405 410 415

Thr Thr Leu Pro Arg Gly Gly Gly Lys Asp Gly Cys Ser Pro Ile Val
420 425 430

Val Lys Lys Gly Gln Val Val Met Tyr Thr Val Ile Gly Thr His Arg
435 440 445

Asp Pro Ser Ile Tyr Gly Ala Asp Ala Asp Val Phe Arg Pro Glu Arg
450 455 460

Trp Phe Glu Pro Glu Thr Arg Lys Leu Gly Trp Ala Tyr Val Pro Phe
465 470 475 480

Asn Gly Gly Pro Arg Ile Cys Leu Gly Gln Gln Phe Ala Leu Thr Glu
485 490 495

Ala Ser Tyr Val Thr Val Arg Leu Leu Gln Glu Phe Gly Asn Leu Ser
500 505 510

Leu Asp Pro Asn Ala Glu Tyr Pro Pro Lys Leu Gln Asn Thr Leu Thr
515 520 525

Leu Ser Leu Phe Asp Gly Ala Asp Val Arg Met Phe
530 535 540

<210> 100 
<211> 517 
<212> PRT 

<213> Candida tropicalis
<400> 100

Met Ile Glu Gln Leu Leu Glu Tyr Trp Tyr Val Val Val Pro Val Leu
1 5 10 15

Tyr Ile Ile Lys Gln Leu Leu Ala Tyr Thr Lys Thr Arg Val Leu Met
20 25 30

Lys Lys Leu Gly Ala Ala Pro Val Thr Asn Lys Leu Tyr Asp Asn Ala
35 40 45

Phe Gly Ile Val Asn Gly Trp Lys Ala Leu Gln Phe Lys Lys Glu Gly
50 55 60

Arg Ala Gln Glu Tyr Asn Asp Tyr Lys Phe Asp His Ser Lys Asn Pro
65 70 75 80

Ser Val Gly Thr Tyr Val Ser Ile Leu Phe Gly Thr Arg Ile Val Val
85 90 95

Thr Lys Asp Pro Glu Asn Ile Lys Ala Ile Leu Ala Thr Gln Phe Gly
100 105 110

Asp Phe Ser Leu Gly Lys Arg His Thr Leu Phe Lys Pro Leu Leu Gly
115 120 125

Asp Gly Ile Phe Thr Leu Asp Gly Glu Gly Trp Lys His Ser Arg Ala
130 135 140

Met Leu Arg Pro Gln Phe Ala Arg Glu Gln Val Ala His Val Thr Ser
145 150 155 160

Leu Glu Pro His Phe Gln Leu Leu Lys Lys His Ile Leu Lys His Lys
165 170 175

Gly Glu Tyr Phe Asp Ile Gln Glu Leu Phe Phe Arg Phe Thr Val Asp
180 185 190

Ser Ala Thr Glu Phe Leu Phe Gly Glu Ser Val His Ser Leu Lys Asp
195 200 205

Glu Ser Ile Gly Ile Asn Gln Asp Asp Ile Asp Phe Ala Gly Arg Lys
210 215 220

Asp Phe Ala Glu Ser Phe Asn Lys Ala Gln Glu Tyr Leu Ala Ile Arg
225 230 235 240

Thr Leu Val Gln Thr Phe Tyr Trp Leu Val Asn Asn Lys Glu Phe Arg
245 250 255

Asp Cys Thr Lys Leu Val His Lys Phe Thr Asn Tyr Tyr Val Gln Lys
260 265 270

Ala Leu Asp Ala Ser Pro Glu Glu Leu Glu Lys Gln Ser Gly Tyr Val
275 280 285

Phe Leu Tyr Glu Leu Val Lys Gln Thr Arg Asp Pro Asn Val Leu Arg
290 295 300

Asp Gln Ser Leu Asn Ile Leu Leu Ala Gly Arg Asp Thr Thr Ala Gly
305 310 315 320

Leu Leu Ser Phe Ala Val Phe Glu Leu Ala Arg His Pro Glu Ile Trp
325 330 335

Ala Lys Leu Arg Glu Glu Ile Glu Gln Gln Phe Gly Leu Gly Glu Asp
340 345 350

Ser Arg Val Glu Glu Ile Thr Phe Glu Ser Leu Lys Arg Cys Glu Tyr
355 360 365

Leu Lys Ala Phe Leu Asn Glu Thr Leu Arg Ile Tyr Pro Ser Val Pro
370 375 380

Arg Asn Phe Arg Ile Ala Thr Lys Asn Thr Thr Leu Pro Arg Gly Gly
385 390 395 400

Gly Ser Asp Gly Thr Ser Pro Ile Leu Ile Gln Lys Gly Glu Ala Val
405 410 415

Ser Tyr Gly Ile Asn Ser Thr His Leu Asp Pro Val Tyr Tyr Gly Pro
420 425 430

Asp Ala Ala Glu Phe Arg Pro Glu Arg Trp Phe Glu Pro Ser Thr Lys
435 440 445

Lys Leu Gly Trp Ala Tyr Leu Pro Phe Asn Gly Gly Pro Arg Ile Cys
450 455 460

Leu Gly Gln Gln Phe Ala Leu Thr Glu Ala Gly Tyr Val Leu Val Arg
465 470 475 480

Leu Val Gln Glu Phe Ser His Val Arg Leu Asp Pro Asp Glu Val Tyr
485 490 495

Pro Pro Lys Arg Leu Thr Asn Leu Thr Met Cys Leu Gln Asp Gly Ala
500 505 510

Ile Val Lys Phe Asp
515

<210> 101 
<211> 517 
<212> PRT 

<213> Candida tropicalis
<400> 101

Met Ile Glu Gln Ile Leu Glu Tyr Trp Tyr Ile Val Val Pro Val Leu
1 5 10 15

Tyr Ile Ile Lys Gln Leu Ile Ala Tyr Ser Lys Thr Arg Val Leu Met
20 25 30

Lys Gln Leu Gly Ala Ala Pro Ile Thr Asn Gln Leu Tyr Asp Asn Val
35 40 45

Phe Gly Ile Val Asn Gly Trp Lys Ala Leu Gln Phe Lys Lys Glu Gly
50 55 60

Arg Ala Gln Glu Tyr Asn Asp His Lys Phe Asp Ser Ser Lys Asn Pro
65 70 75 80

Ser Val Gly Thr Tyr Val Ser Ile Leu Phe Gly Thr Lys Ile Val Val
85 90 95

Thr Lys Asp Pro Glu Asn Ile Lys Ala Ile Leu Ala Thr Gln Phe Gly
100 105 110

Asp Phe Ser Leu Gly Lys Arg His Ala Leu Phe Lys Pro Leu Leu Gly
115 120 125

Asp Gly Ile Phe Thr Leu Asp Gly Glu Gly Trp Lys His Ser Arg Ser
130 135 140

Met Leu Arg Pro Gln Phe Ala Arg Glu Gln Val Ala His Val Thr Ser
145 150 155 160

Leu Glu Pro His Phe Gln Leu Leu Lys Lys His Ile Leu Lys His Lys
165 170 175

Gly Glu Tyr Phe Asp Ile Gln Glu Leu Phe Phe Arg Phe Thr Val Asp
180 185 190

Ser Ala Thr Glu Phe Leu Phe Gly Glu Ser Val His Ser Leu Lys Asp
195 200 205

Glu Thr Ile Gly Ile Asn Gln Asp Asp Ile Asp Phe Ala Gly Arg Lys
210 215 220

Asp Phe Ala Glu Ser Phe Asn Lys Ala Gln Glu Tyr Leu Ser Ile Arg
225 230 235 240

Ile Leu Val Gln Thr Phe Tyr Trp Leu Ile Asn Asn Lys Glu Phe Arg
245 250 255

Asp Cys Thr Lys Leu Val His Lys Phe Thr Asn Tyr Tyr Val Gln Lys
260 265 270

Ala Leu Asp Ala Thr Pro Glu Glu Leu Glu Lys Gln Gly Gly Tyr Val
275 280 285

Phe Leu Tyr Glu Leu Val Lys Gln Thr Arg Asp Pro Lys Val Leu Arg
290 295 300

Asp Gln Ser Leu Asn Ile Leu Leu Ala Gly Arg Asp Thr Thr Ala Gly
305 310 315 320

Leu Leu Ser Phe Ala Val Phe Glu Leu Ala Arg Asn Pro His Ile Trp
325 330 335

Ala Lys Leu Arg Glu Glu Ile Glu Gln Gln Phe Gly Leu Gly Glu Asp
340 345 350

Ser Arg Val Glu Glu Ile Thr Phe Glu Ser Leu Lys Arg Cys Glu Tyr
355 360 365

Leu Lys Ala Phe Leu Asn Glu Thr Leu Arg Val Tyr Pro Ser Val Pro
370 375 380

Arg Asn Phe Arg Ile Ala Thr Lys Asn Thr Thr Leu Pro Arg Gly Gly
385 390 395 400

Gly Pro Asp Gly Thr Gln Pro Ile Leu Ile Gln Lys Gly Glu Gly Val
405 410 415

Ser Tyr Gly Ile Asn Ser Thr His Leu Asp Pro Val Tyr Tyr Gly Pro
420 425 430

Asp Ala Ala Glu Phe Arg Pro Glu Arg Trp Phe Glu Pro Ser Thr Arg
435 440 445

Lys Leu Gly Trp Ala Tyr Leu Pro Phe Asn Gly Gly Pro Arg Ile Cys
450 455 460

Leu Gly Gln Gln Phe Ala Leu Thr Glu Ala Gly Tyr Val Leu Val Arg
465 470 475 480

Leu Val Gln Glu Phe Ser His Ile Arg Leu Asp Pro Asp Glu Val Tyr
485 490 495

Pro Pro Lys Arg Leu Thr Asn Leu Thr Met Cys Leu Gln Asp Gly Ala
500 505 510

Ile Val Lys Phe Asp
515



<210> 102 
<211> 512 
<212> PRT 
<213> Candida tropicalis

<400> 102

Met Leu Asp Gln Ile Leu His Tyr Trp Tyr Ile Val Leu Pro Leu Leu
1 5 10 15

Ala Ile Ile Asn Gln Ile Val Ala His Val Arg Thr Asn Tyr Leu Met
20 25 30

Lys Lys Leu Gly Ala Lys Pro Phe Thr His Val Gln Arg Asp Gly Trp
35 40 45

Leu Gly Phe Lys Phe Gly Arg Glu Phe Leu Lys Ala Lys Ser Ala Gly
50 55 60

Arg Leu Val Asp Leu Ile Ile Ser Arg Phe His Asp Asn Glu Asp Thr
65 70 75 80

Phe Ser Ser Tyr Ala Phe Gly Asn His Val Val Phe Thr Arg Asp Pro
85 90 95

Glu Asn Ile Lys Ala Leu Leu Ala Thr Gln Phe Gly Asp Phe Ser Leu
100 105 110

Gly Ser Arg Val Lys Phe Phe Lys Pro Leu Leu Gly Tyr Gly Ile Phe
115 120 125

Thr Leu Asp Ala Glu Gly Trp Lys His Ser Arg Ala Met Leu Arg Pro
130 135 140

Gln Phe Ala Arg Glu Gln Val Ala His Val Thr Ser Leu Glu Pro His
145 150 155 160

Phe Gln Leu Leu Lys Lys His Ile Leu Lys His Lys Gly Glu Tyr Phe
165 170 175

Asp Ile Gln Glu Leu Phe Phe Arg Phe Thr Val Asp Ser Ala Thr Glu
180 185 190

Phe Leu Phe Gly Glu Ser Val His Ser Leu Lys Asp Glu Glu Ile Gly
195 200 205

Tyr Asp Thr Lys Asp Met Ser Glu Glu Arg Arg Arg Phe Ala Asp Ala
210 215 220

Phe Asn Lys Ser Gln Val Tyr Val Ala Thr Arg Val Ala Leu Gln Asn
225 230 235 240

Leu Tyr Trp Leu Val Asn Asn Lys Glu Phe Lys Glu Cys Asn Asp Ile
245 250 255

Val His Lys Phe Thr Asn Tyr Tyr Val Gln Lys Ala Leu Asp Ala Thr
260 265 270

Pro Glu Glu Leu Glu Lys Gln Gly Gly Tyr Val Phe Leu Tyr Glu Leu
275 280 285

Val Lys Gln Thr Arg Asp Pro Lys Val Leu Arg Asp Gln Ser Leu Asn
290 295 300

Ile Leu Leu Ala Gly Arg Asp Thr Thr Ala Gly Leu Leu Ser Phe Ala
305 310 315 320

Val Phe Glu Leu Ala Arg Asn Pro His Ile Trp Ala Lys Leu Arg Glu
325 330 335

Glu Ile Glu Gln Gln Phe Gly Leu Gly Glu Asp Ser Arg Val Glu Glu
340 345 350

Ile Thr Phe Glu Ser Leu Lys Arg Cys Glu Tyr Leu Lys Ala Val Leu
355 360 365

Asn Glu Thr Leu Arg Leu His Pro Ser Val Pro Arg Asn Ala Arg Phe
370 375 380

Ala Ile Lys Asp Thr Thr Leu Pro Arg Gly Gly Gly Pro Asn Gly Lys
385 390 395 400

Asp Pro Ile Leu Ile Arg Lys Asp Glu Val Val Gln Tyr Ser Ile Ser
405 410 415

Ala Thr Gln Thr Asn Pro Ala Tyr Tyr Gly Ala Asp Ala Ala Asp Phe
420 425 430

Arg Pro Glu Arg Trp Phe Glu Pro Ser Thr Arg Asn Leu Gly Trp Ala
435 440 445

Phe Leu Pro Phe Asn Gly Gly Pro Arg Ile Cys Leu Gly Gln Gln Phe
450 455 460

Ala Leu Thr Glu Ala Gly Tyr Val Leu Val Arg Leu Val Gln Glu Phe
465 470 475 480

Pro Asn Leu Ser Gln Asp Pro Glu Thr Lys Tyr Pro Pro Pro Arg Leu
485 490 495

Ala His Leu Thr Met Cys Leu Phe Asp Gly Ala His Val Lys Met Ser
500 505 510

<210> 103 
<211> 512 
<212> PRT 

<213> Candida tropicalis
<400> 103

Met Leu Asp Gln Ile Phe His Tyr Trp Tyr Ile Val Leu Pro Leu Leu
1 5 10 15

Val Ile Ile Lys Gln Ile Val Ala His Ala Arg Thr Asn Tyr Leu Met
20 25 30

Lys Lys Leu Gly Ala Lys Pro Phe Thr His Val Gln Leu Asp Gly Trp
35 40 45

Phe Gly Phe Lys Phe Gly Arg Glu Phe Leu Lys Ala Lys Ser Ala Gly
50 55 60

Arg Gln Val Asp Leu Ile Ile Ser Arg Phe His Asp Asn Glu Asp Thr
65 70 75 80

Phe Ser Ser Tyr Ala Phe Gly Asn His Val Val Phe Thr Arg Asp Pro
85 90 95

Glu Asn Ile Lys Ala Leu Leu Ala Thr Gln Phe Gly Asp Phe Ser Leu
100 105 110

Gly Ser Arg Val Lys Phe Phe Lys Pro Leu Leu Gly Tyr Gly Ile Phe
115 120 125

Thr Leu Asp Gly Glu Gly Trp Lys His Ser Arg Ala Met Leu Arg Pro
130 135 140

Gln Phe Ala Arg Glu Gln Val Ala His Val Thr Ser Leu Glu Pro His
145 150 155 160

Phe Gln Leu Leu Lys Lys His Ile Leu Lys His Lys Gly Glu Tyr Phe
165 170 175

Asp Ile Gln Glu Leu Phe Phe Arg Phe Thr Val Asp Ser Ala Thr Glu
180 185 190

Phe Leu Phe Gly Glu Ser Val His Ser Leu Arg Asp Glu Glu Ile Gly
195 200 205

Tyr Asp Thr Lys Asp Met Ala Glu Glu Arg Arg Lys Phe Ala Asp Ala
210 215 220

Phe Asn Lys Ser Gln Val Tyr Leu Ser Thr Arg Val Ala Leu Gln Thr
225 230 235 240

Leu Tyr Trp Leu Val Asn Asn Lys Glu Phe Lys Glu Cys Asn Asp Ile
245 250 255

Val His Lys Phe Thr Asn Tyr Tyr Val Gln Lys Ala Leu Asp Ala Thr
260 265 270

Pro Glu Glu Leu Glu Lys Gln Gly Gly Tyr Val Phe Leu Tyr Glu Leu
275 280 285

Ala Lys Gln Thr Lys Asp Pro Asn Val Leu Arg Asp Gln Ser Leu Asn
290 295 300

Ile Leu Leu Ala Gly Arg Asp Thr Thr Ala Gly Leu Leu Ser Phe Ala
305 310 315 320

Val Phe Glu Leu Ala Arg Asn Pro His Ile Trp Ala Lys Leu Arg Glu
325 330 335

Glu Ile Glu Ser His Phe Gly Leu Gly Glu Asp Ser Arg Val Glu Glu
340 345 350

Ile Thr Phe Glu Ser Leu Lys Arg Cys Glu Tyr Leu Lys Ala Val Leu
355 360 365

Asn Glu Thr Leu Arg Leu His Pro Ser Val Pro Arg Asn Ala Arg Phe
370 375 380

Ala Ile Lys Asp Thr Thr Leu Pro Arg Gly Gly Gly Pro Asn Gly Lys
385 390 395 400

Asp Pro Ile Leu Ile Arg Lys Asn Glu Val Val Gln Tyr Ser Ile Ser
405 410 415

Ala Thr Gln Thr Asn Pro Ala Tyr Tyr Gly Ala Asp Ala Ala Asp Phe
420 425 430

Arg Pro Glu Arg Trp Phe Glu Pro Ser Thr Arg Asn Leu Gly Trp Ala
435 440 445

Tyr Leu Pro Phe Asn Gly Gly Pro Arg Ile Cys Leu Gly Gln Gln Phe
450 455 460

Ala Leu Thr Glu Ala Gly Tyr Val Leu Val Arg Leu Val Gln Glu Phe
465 470 475 480

Pro Ser Leu Ser Gln Asp Pro Glu Thr Glu Tyr Pro Pro Pro Arg Leu
485 490 495

Ala His Leu Thr Met Cys Leu Phe Asp Gly Ala Tyr Val Lys Met Gln
500 505 510

<210> 104 
<211> 499 
<212> PRT 

<213> Candida tropicalis
<400> 104

Met Ala Ile Ser Ser Leu Leu Ser Trp Asp Val Ile Cys Val Val Phe
1 5 10 15

Ile Cys Val Cys Val Tyr Phe Gly Tyr Glu Tyr Cys Tyr Thr Lys Tyr
20 25 30

Leu Met His Lys His Gly Ala Arg Glu Ile Glu Asn Val Ile Asn Asp
35 40 45

Gly Phe Phe Gly Phe Arg Leu Pro Leu Leu Leu Met Arg Ala Ser Asn
50 55 60

Glu Gly Arg Leu Ile Glu Phe Ser Val Lys Arg Phe Glu Ser Ala Pro
65 70 75 80

His Pro Gln Asn Lys Thr Leu Val Asn Arg Ala Leu Ser Val Pro Val
85 90 95

Ile Leu Thr Lys Asp Pro Val Asn Ile Lys Ala Met Leu Ser Thr Gln
100 105 110

Phe Asp Asp Phe Ser Leu Gly Leu Arg Leu His Gln Phe Ala Pro Leu
115 120 125

Leu Gly Lys Gly Ile Phe Thr Leu Asp Gly Pro Glu Trp Lys Gln Ser
130 135 140

Arg Ser Met Leu Arg Pro Gln Phe Ala Lys Asp Arg Val Ser His Ile
145 150 155 160

Leu Asp Leu Glu Pro His Phe Val Leu Leu Arg Lys His Ile Asp Gly
165 170 175

His Asn Gly Asp Tyr Phe Asp Ile Gln Glu Leu Tyr Phe Arg Phe Ser
180 185 190

Met Asp Val Ala Thr Gly Phe Leu Phe Gly Glu Ser Val Gly Ser Leu
195 200 205

Lys Asp Glu Asp Ala Arg Phe Leu Glu Ala Phe Asn Glu Ser Gln Lys
210 215 220

Tyr Leu Ala Thr Arg Ala Thr Leu His Glu Leu Tyr Phe Leu Cys Asp
225 230 235 240

Gly Phe Arg Phe Arg Gln Tyr Asn Lys Val Val Arg Lys Phe Cys Ser
245 250 255

Gln Cys Val His Lys Ala Leu Asp Val Ala Pro Glu Asp Thr Ser Glu
260 265 270

Tyr Val Phe Leu Arg Glu Leu Val Lys His Thr Arg Asp Pro Val Val
275 280 285

Leu Gln Asp Gln Ala Leu Asn Val Leu Leu Ala Gly Arg Asp Thr Thr
290 295 300

Ala Ser Leu Leu Ser Phe Ala Thr Phe Glu Leu Ala Arg Asn Asp His
305 310 315 320

Met Trp Arg Lys Leu Arg Glu Glu Val Ile Leu Thr Met Gly Pro Ser
325 330 335

Ser Asp Glu Ile Thr Val Ala Gly Leu Lys Ser Cys Arg Tyr Leu Lys
340 345 350

Ala Ile Leu Asn Glu Thr Leu Arg Leu Tyr Pro Ser Val Pro Arg Asn
355 360 365

Ala Arg Phe Ala Thr Arg Asn Thr Thr Leu Pro Arg Gly Gly Gly Pro
370 375 380

Asp Gly Ser Phe Pro Ile Leu Ile Arg Lys Gly Gln Pro Val Gly Tyr
385 390 395 400

Phe Ile Cys Ala Thr His Leu Asn Glu Lys Val Tyr Gly Asn Asp Ser
405 410 415

His Val Phe Arg Pro Glu Arg Trp Ala Ala Leu Glu Gly Lys Ser Leu
420 425 430

Gly Trp Ser Tyr Leu Pro Phe Asn Gly Gly Pro Arg Ser Cys Leu Gly
435 440 445

Gln Gln Phe Ala Ile Leu Glu Ala Ser Tyr Val Leu Ala Arg Leu Thr
450 455 460

Gln Cys Tyr Thr Thr Ile Gln Leu Arg Thr Thr Glu Tyr Pro Pro Lys
465 470 475 480

Lys Leu Val His Leu Thr Met Ser Leu Leu Asn Gly Val Tyr Ile Arg
485 490 495

Thr Arg Thr



<210> 105 

<211> 1712 

<212> DNA 

<213> Candida tropicalis 

<400> 105 



ggtaccgagc tcacgagttt tgggattttc gagtttggat tgtttccttt gttgattgaa 60
ttgacgaaac cagaggtttt caagacagat aagattgggt ttatcaaaac gcagtttgaa 120
atattccagt tggtttccaa gatatcttga agaagattga cgatttgaaa tttgaagaag 180
tggagaagat ctggtttgga ttgttggaga atttcaagaa tctcaagatt tactctaacg 240
acgggtacaa cgagaattgt attgaattga tcaagaacat gatcttggtg ttacagaaca 300
tcaagttctt ggaccagact gagaatgcca cagatataca aggcgtcatg tgataaaatg 360
gatgagattt atcccacaat tgaagaaaga gtttatggaa agtggtcaac cagaagctaa 420
[Residues 421-1200 are illegible in the source scan, apart from the isolated blocks tattttgtat (row ending 480), ctagtcacgt (row ending 540), aaaaatcaaa (row ending 600) and tgaccacaac (row ending 780).]
[illegible] cgggacatgg ggggtagaga agaagggttt gattggatca [illegible] 1260
tggtgtgggg [illegible] aaggcgatgc gttgggccag [illegible] [illegible] 1320
[Residues 1321-1500 are illegible in the source scan, apart from the isolated block tgtcgggaga (row ending 1380).]


aggacgtcat tgtagtcttc gaagttgtct gctagtttag ttctcatgat ttcgaaaacc 1560
aataacgcaa tggatgtagc agggatggtg gttagtgcgt tcctgacaaa cccagagtac 1620
gccgcctcaa accacgtcac attcgccctt tgcttcatcc gcatcacttg cttgaaggta 1680
tccacgtacg agttgtaata caccttgaag aa 1712



<210> 106 
<211> 267 
<212> PRT 

<213> Candida tropicalis
<400> 106

Met Val Ser Thr Lys Thr Tyr Thr Glu Arg Ala Ser Ala His Pro Ser
1 5 10 15

Lys Val Ala Gln Arg Leu Phe Arg Leu Met Glu Ser Lys Lys Thr Asn
20 25 30

Leu Cys Ala Ser Ile Asp Val Thr Thr Thr Ala Glu Phe Leu Ser Leu
35 40 45

Ile Asp Lys Leu Gly Pro His Ile Cys Leu Val Lys Thr His Ile Asp
50 55 60

Ile Ile Ser Asp Phe Ser Tyr Glu Gly Thr Ile Glu Pro Leu Leu Val
65 70 75 80

Leu Ala Glu Arg His Gly Phe Leu Ile Phe Glu Asp Arg Lys Phe Ala
85 90 95

Asp Ile Gly Asn Thr Val Met Leu Gln Tyr Thr Ser Gly Val Tyr Arg
100 105 110

Ile Ala Ala Trp Ser Asp Ile Thr Asn Ala His Gly Val Thr Gly Lys
115 120 125

Gly Val Val Glu Gly Leu Lys Arg Gly Ala Glu Gly Val Glu Lys Glu
130 135 140

Arg Gly Val Leu Met Leu Ala Glu Leu Ser Ser Lys Gly Ser Leu Ala
145 150 155 160

His Gly Glu Tyr Thr Arg Glu Thr Ile Glu Ile Ala Lys Ser Asp Arg
165 170 175

Glu Phe Val Ile Gly Phe Ile Ala Gln Arg Asp Met Gly Gly Arg Glu
180 185 190

Glu Gly Phe Asp Trp Ile Ile Met Thr Pro Gly Val Gly Leu Asp Asp
195 200 205

Lys Gly Asp Ala Leu Gly Gln Gln Tyr Arg Thr Val Asp Glu Val Val
210 215 220

Leu Thr Gly Thr Asp Val Ile Ile Val Gly Arg Gly Leu Phe Gly Lys
225 230 235 240

Gly Arg Asp Pro Glu Val Glu Gly Lys Arg Tyr Arg Asp Ala Gly Trp
245 250 255

Lys Ala Tyr Leu Lys Arg Thr Gly Gln Leu Glu
260 265



<210> 107 
<211> 473 
<212> DNA

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Primer






<400> 107 
gtcaaagcaa attgttggcc caagcagact cttggaccac cgttgaatgg aacataagcc 60
cagcccaact tcttagtaga tggttcaaac catctttctg gtctgaagtc gttagcgtcc 120
ttaccgtagt attcttccaa acggtgggtc ttgtagacaa cgtaagcaac agtggagcct 180
ttaggaatgt agattgggtc ggtaccgtta [illegible] [illegible] agtggtgtct 240
ctggtggcgg ttctaaagtt gacaggaaca gatgggtaca tacgcaaggt ttcgttaagg 300
atagccttca agtattcaca tctcttcaag gcttcgaaag taatttcttc aacgcgggag 360
tcttcaccaa caccaaagtt aacttcgatt tcttctctca acttggacca catctctggg 420
tgtctagcca attcaaacaa agcaaaggac aacaaacccg cggtggtgtc tct 473


<210> 108 
<211> 540 
<212> DNA 

<213> Candida tropicalis 










<400> 108 
tactaacttg ttgaggatct tataaccata cagcaacacg gtcacaacat gtagtagttt 60
gttgaggaac gtatgtgttt ctgagcgcag aactactttt tcaacccacg acgaggtcag 120
tgtttgttca acatgctgtt gcgaaagcca tagcagttac ctaccttccg agaggtcaag 180
ttctttctcc cgtcccgagt tctcatgttg ctaatgttca aactggtgag gttcttgggt 240
tcgcacccgt ggatgcagtc ataagaaaag ccgtggtcct agcagcactg gtttctaggt 300
ctcttatagt ttcgataaaa ccgttgggtc aaaccactaa aaagaaaccc gttctccgtg 360
tgagaaaaat tcggaaacaa tccactaccc tagaagtgta acctgccgct tccgaccttc 420
gtgtcgtctc ggtacaactc tggtgtcaaa cggtctcttg ttcaacgagt acactgcagc 480
aaccttggtg tgaaggtcaa caacttcttc gtataagaat tcgtgttccc acttatgaaa 540


<210> 109 
<211> 29 
<212> DNA 

<213> Bacteriophage T7 










<400> 109 
ggatcctaat acgactcact atagggagg 29


<210> 110 
<211> 523 















<212> PRT

<213> Candida tropicalis

<400> 110

Met Ala Thr Gln Glu Ile Ile Asp Ser Val Leu Pro Tyr Leu Thr Lys
1 5 10 15

Trp Tyr Thr Val Ile Thr Ala Ala Val Leu Val Phe Leu Ile Ser Thr
20 25 30

Asn Ile Lys Asn Tyr Val Lys Ala Lys Lys Leu Lys Cys Val Asp Pro
35 40 45

Pro Tyr Leu Lys Asp Ala Gly Leu Thr Gly Ile Ser Ser Leu Ile Ala
50 55 60

Ala Ile Lys Ala Lys Asn Asp Gly Arg Leu Ala Asn Phe Ala Asp Glu
65 70 75 80

Val Phe Asp Glu Tyr Pro Asn His Thr Phe Tyr Leu Ser Val Ala Gly
85 90 95

Ala Leu Lys Ile Val Met Thr Val Asp Pro Glu Asn Ile Lys Ala Val
100 105 110

Leu Ala Thr Gln Phe Thr Asp Phe Ser Leu Gly Thr Arg His Ala His
115 120 125

Phe Ala Pro Leu Leu Gly Asp Gly Ile Phe Thr Leu Asp Gly Glu Gly
130 135 140

Trp Lys His Ser Arg Ala Met Leu Arg Pro Gln Phe Ala Arg Asp Gln
145 150 155 160

Ile Gly His Val Lys Ala Leu Glu Pro His Ile Gln Ile Met Ala Lys
165 170 175

Gln Ile Lys Leu Asn Gln Gly Lys Thr Phe Asp Ile Gln Glu Leu Phe
180 185 190

Phe Arg Phe Thr Val Asp Thr Ala Thr Glu Phe Leu Phe Gly Glu Ser
195 200 205

Val His Ser Leu Tyr Asp Glu Lys Leu Gly Ile Pro Thr Pro Asn Glu
210 215 220

Ile Pro Gly Arg Glu Asn Phe Ala Ala Ala Phe Asn Val Ser Gln His
225 230 235 240

Tyr Leu Ala Thr Arg Ser Tyr Ser Gln Thr Phe Tyr Phe Leu Thr Asn
245 250 255

Pro Lys Glu Phe Arg Asp Cys Asn Ala Lys Val His His Leu Ala Lys
260 265 270

Tyr Phe Val Asn Lys Ala Leu Asn Phe Thr Pro Glu Glu Leu Glu Glu
275 280 285

Lys Ser Lys Ser Gly Tyr Val Phe Leu Tyr Glu Leu Val Lys Gln Thr
290 295 300

Arg Asp Pro Lys Val Leu Gln Asp Gln Leu Leu Asn Ile Met Val Ala
305 310 315 320

Gly Arg Asp Thr Thr Ala Gly Leu Leu Ser Phe Ala Leu Phe Glu Leu
325 330 335

Ala Arg His Pro Glu Met Trp Ser Lys Leu Arg Glu Glu Ile Glu Val
340 345 350

Asn Phe Gly Val Gly Glu Asp Ser Arg Val Glu Glu Ile Thr Phe Glu
355 360 365

Ala Leu Lys Arg Cys Glu Tyr Leu Lys Ala Ile Leu Asn Glu Thr Leu
370 375 380

Arg Met Tyr Pro Ser Val Pro Val Asn Phe Arg Thr Ala Thr Arg Asp
385 390 395 400

Thr Thr Leu Pro Arg Gly Gly Gly Ala Asn Gly Thr Asp Pro Ile Tyr
405 410 415

Ile Pro Lys Gly Ser Thr Val Ala Tyr Val Val Tyr Lys Thr His Arg
420 425 430

Leu Glu Glu Tyr Tyr Gly Lys Asp Ala Asn Asp Phe Arg Pro Glu Arg
435 440 445

Trp Phe Glu Pro Ser Thr Lys Lys Leu Gly Trp Ala Tyr Val Pro Phe
450 455 460

Asn Gly Gly Pro Arg Val Cys Leu Gly Gln Gln Phe Ala Leu Thr Glu
465 470 475 480

Ala Ser Tyr Val Ile Thr Arg Leu Ala Gln Met Phe Glu Thr Val Ser
485 490 495

Ser Asp Pro Gly Leu Glu Tyr Pro Pro Pro Lys Cys Ile His Leu Thr
500 505 510

Met Ser His Asn Asp Gly Val Phe Val Lys Met
515 520

<210> 111 
<211> 540 
<212> PRT 

<213> Candida tropicalis
<400> 111

Met Ser Ser Ser Pro Ser Phe Ala Gln Glu Val Leu Ala Thr Thr Ser
1 5 10 15

Pro Tyr Ile Glu Tyr Phe Leu Asp Asn Tyr Thr Arg Trp Tyr Tyr Phe
20 25 30

Ile Pro Leu Val Leu Leu Ser Leu Asn Phe Ile Ser Leu Leu His Thr
35 40 45

Lys Tyr Leu Glu Arg Arg Phe His Ala Lys Pro Leu Gly Asn Val Val
50 55 60

Leu Asp Pro Thr Phe Gly Ile Ala Thr Pro Leu Ile Leu Ile Tyr Leu
65 70 75 80

Lys Ser Lys Gly Thr Val Met Lys Phe Ala Trp Ser Phe Trp Asn Asn
85 90 95

Lys Tyr Ile Val Lys Asp Pro Lys Tyr Lys Thr Thr Gly Leu Arg Ile
100 105 110

Val Gly Leu Pro Leu Ile Glu Thr Ile Asp Pro Glu Asn Ile Lys Ala
115 120 125

Val Leu Ala Thr Gln Phe Asn Asp Phe Ser Leu Gly Thr Arg His Asp
130 135 140

Phe Leu Tyr Ser Leu Leu Gly Asp Gly Ile Phe Thr Leu Asp Gly Ala
145 150 155 160

Gly Trp Lys His Ser Arg Thr Met Leu Arg Pro Gln Phe Ala Arg Glu
165 170 175

Gln Val Ser His Val Lys Leu Leu Glu Pro His Val Gln Val Phe Phe
180 185 190

Lys His Val Arg Lys His Arg Gly Gln Thr Phe Asp Ile Gln Glu Leu
195 200 205

Phe Phe Arg Leu Thr Val Asp Ser Ala Thr Glu Phe Leu Phe Gly Glu
210 215 220

Ser Ala Glu Ser Leu Arg Asp Asp Ser Val Gly Leu Thr Pro Thr Thr
225 230 235 240

Lys Asp Phe Glu Gly Arg Gly Asp Phe Ala Asp Ala Phe Asn Tyr Ser
245 250 255

Gln Thr Tyr Gln Ala Tyr Arg Phe Leu Leu Gln Gln Met Tyr Trp Ile
260 265 270

Leu Asn Gly Ala Glu Phe Arg Lys Ser Ile Ala Ile Val His Lys Phe
275 280 285

Ala Asp His Tyr Val Gln Lys Ala Leu Glu Leu Thr Asp Asp Asp Leu
290 295 300

Gln Lys Gln Asp Gly Tyr Val Phe Leu Tyr Glu Leu Ala Lys Gln Thr
305 310 315 320

Arg Asp Pro Lys Val Leu Arg Asp Gln Leu Leu Asn Ile Leu Val Ala
325 330 335

Gly Arg Asp Thr Thr Ala Gly Leu Leu Ser Phe Val Phe Tyr Glu Leu
340 345 350

Ser Arg Asn Pro Glu Val Phe Ala Lys Leu Arg Glu Glu Val Glu Asn
355 360 365

Arg Phe Gly Leu Gly Glu Glu Ala Arg Val Glu Glu Ile Ser Phe Glu
370 375 380

Ser Leu Lys Ser Cys Glu Tyr Leu Lys Ala Val Ile Asn Glu Ala Leu
385 390 395 400

Arg Leu Tyr Pro Ser Val Pro His Asn Phe Arg Val Ala Thr Arg Asn
405 410 415

Thr Thr Leu Pro Arg Gly Gly Gly Lys Asp Gly Cys Ser Pro Ile Val
420 425 430

Val Lys Lys Gly Gln Val Val Met Tyr Thr Val Ile Gly Thr His Arg
435 440 445

Asp Pro Ser Ile Tyr Gly Ala Asp Ala Asp Val Phe Arg Pro Glu Arg
450 455 460

Trp Phe Glu Pro Glu Thr Arg Lys Leu Gly Trp Ala Tyr Val Pro Phe
465 470 475 480

Asn Gly Gly Pro Arg Ile Cys Leu Gly Gln Gln Phe Ala Leu Thr Glu
485 490 495

Ala Ser Tyr Val Thr Val Arg Leu Leu Gln Glu Phe Gly Asn Leu Ser
500 505 510

Ser Asp Pro Asn Ala Glu Tyr Pro Pro Lys Leu Gln Asn Thr Leu Thr
515 520 525

Leu Ser Leu Phe Asp Gly Ala Asp Val Arg Met Phe
530 535 540



<210> 112 
<211> 517 
<212> PRT 

<213> Candida tropicalis
<400> 112

Met Ile Glu Gln Leu Leu Glu Tyr Trp Tyr Val Val Val Pro Val Leu
1 5 10 15

Tyr Ile Ile Lys Gln Leu Leu Ala Tyr Thr Lys Thr Arg Val Leu Met
20 25 30

Lys Lys Leu Gly Ala Ala Pro Val Thr Asn Lys Leu Tyr Asp Asn Ala
35 40 45

Phe Gly Ile Val Asn Gly Trp Lys Ala Leu Gln Phe Lys Lys Glu Gly
50 55 60

Arg Ala Gln Glu Tyr Asn Asp Tyr Lys Phe Asp His Ser Lys Asn Pro
65 70 75 80

Ser Val Gly Thr Tyr Val Ser Ile Leu Phe Gly Thr Arg Ile Val Val
85 90 95

Thr Lys Asp Pro Glu Asn Ile Lys Ala Ile Leu Ala Thr Gln Phe Gly
100 105 110

Asp Phe Ser Leu Gly Lys Arg His Thr Leu Phe Lys Pro Leu Leu Gly
115 120 125

Asp Gly Ile Phe Thr Leu Asp Gly Glu Gly Trp Lys His Ser Arg Ala
130 135 140

Met Leu Arg Pro Gln Phe Ala Arg Glu Gln Val Ala His Val Thr Ser
145 150 155 160

Leu Glu Pro His Phe Gln Leu Leu Lys Lys His Ile Leu Lys His Lys
165 170 175

Gly Glu Tyr Phe Asp Ile Gln Glu Leu Phe Phe Arg Phe Thr Val Asp
180 185 190

Ser Ala Thr Glu Phe Leu Phe Gly Glu Ser Val His Ser Leu Lys Asp
195 200 205

Glu Ser Ile Gly Ile Asn Gln Asp Asp Ile Asp Phe Ala Gly Arg Lys
210 215 220

Asp Phe Ala Glu Ser Phe Asn Lys Ala Gln Glu Tyr Leu Ala Ile Arg
225 230 235 240

Thr Leu Val Gln Thr Phe Tyr Trp Leu Val Asn Asn Lys Glu Phe Arg
245 250 255

Asp Cys Thr Lys Ser Val His Lys Phe Thr Asn Tyr Tyr Val Gln Lys
260 265 270

Ala Leu Asp Ala Ser Pro Glu Glu Leu Glu Lys Gln Ser Gly Tyr Val
275 280 285

Phe Leu Tyr Glu Leu Val Lys Gln Thr Arg Asp Pro Asn Val Leu Arg
290 295 300

Asp Gln Ser Leu Asn Ile Leu Leu Ala Gly Arg Asp Thr Thr Ala Gly
305 310 315 320

Leu Leu Ser Phe Ala Val Phe Glu Leu Ala Arg His Pro Glu Ile Trp
325 330 335

Ala Lys Leu Arg Glu Glu Ile Glu Gln Gln Phe Gly Leu Gly Glu Asp
340 345 350

Ser Arg Val Glu Glu Ile Thr Phe Glu Ser Leu Lys Arg Cys Glu Tyr
355 360 365

Leu Lys Ala Phe Leu Asn Glu Thr Leu Arg Ile Tyr Pro Ser Val Pro
370 375 380

Arg Asn Phe Arg Ile Ala Thr Lys Asn Thr Thr Leu Pro Arg Gly Gly
385 390 395 400

Gly Ser Asp Gly Thr Ser Pro Ile Leu Ile Gln Lys Gly Glu Ala Val
405 410 415

Ser Tyr Gly Ile Asn Ser Thr His Leu Asp Pro Val Tyr Tyr Gly Pro
420 425 430

Asp Ala Ala Glu Phe Arg Pro Glu Arg Trp Phe Glu Pro Ser Thr Lys
435 440 445

Lys Leu Gly Trp Ala Tyr Leu Pro Phe Asn Gly Gly Pro Arg Ile Cys
450 455 460

Leu Gly Gln Gln Phe Ala Leu Thr Glu Ala Gly Tyr Val Leu Val Arg
465 470 475 480

Leu Val Gln Glu Phe Ser His Val Arg Ser Asp Pro Asp Glu Val Tyr
485 490 495

Pro Pro Lys Arg Leu Thr Asn Leu Thr Met Cys Leu Gln Asp Gly Ala
500 505 510

Ile Val Lys Phe Asp
515

<210> 113 

<211> 517 

<212> PRT 

<213> Candida tropicalis

<400> 113

Met Ile Glu Gln Ile Leu Glu Tyr Trp Tyr Ile Val Val Pro Val Leu
1 5 10 15

Tyr Ile Ile Lys Gln Leu Ile Ala Tyr Ser Lys Thr Arg Val Leu Met
20 25 30

Lys Gln Leu Gly Ala Ala Pro Ile Thr Asn Gln Leu Tyr Asp Asn Val
35 40 45

Phe Gly Ile Val Asn Gly Trp Lys Ala Leu Gln Phe Lys Lys Glu Gly
50 55 60

Arg Ala Gln Glu Tyr Asn Asp His Lys Phe Asp Ser Ser Lys Asn Pro
65 70 75 80

Ser Val Gly Thr Tyr Val Ser Ile Leu Phe Gly Thr Lys Ile Val Val
85 90 95

Thr Lys Asp Pro Glu Asn Ile Lys Ala Ile Leu Ala Thr Gln Phe Gly
100 105 110

Asp Phe Ser Leu Gly Lys Arg His Ala Leu Phe Lys Pro Leu Leu Gly
115 120 125

Asp Gly Ile Phe Thr Leu Asp Gly Glu Gly Trp Lys His Ser Arg Ser
130 135 140

Met Leu Arg Pro Gln Phe Ala Arg Glu Gln Val Ala His Val Thr Ser
145 150 155 160

Leu Glu Pro His Phe Gln Leu Leu Lys Lys His Ile Leu Lys His Lys
165 170 175

Gly Glu Tyr Phe Asp Ile Gln Glu Leu Phe Phe Arg Phe Thr Val Asp
180 185 190

Ser Ala Thr Glu Phe Leu Phe Gly Glu Ser Val His Ser Leu Lys Asp
195 200 205

Glu Thr Ile Gly Ile Asn Gln Asp Asp Ile Asp Phe Ala Gly Arg Lys
210 215 220

Asp Phe Ala Glu Ser Phe Asn Lys Ala Gln Glu Tyr Leu Ser Ile Arg
225 230 235 240

Ile Leu Val Gln Thr Phe Tyr Trp Leu Ile Asn Asn Lys Glu Phe Arg
245 250 255

Asp Cys Thr Lys Ser Val His Lys Phe Thr Asn Tyr Tyr Val Gln Lys
260 265 270

Ala Leu Asp Ala Thr Pro Glu Glu Leu Glu Lys Gln Gly Gly Tyr Val
275 280 285

Phe Leu Tyr Glu Leu Val Lys Gln Thr Arg Asp Pro Lys Val Leu Arg
290 295 300

Asp Gln Ser Leu Asn Ile Leu Leu Ala Gly Arg Asp Thr Thr Ala Gly
305 310 315 320

Leu Leu Ser Phe Ala Val Phe Glu Leu Ala Arg Asn Pro His Ile Trp
325 330 335

Ala Lys Leu Arg Glu Glu Ile Glu Gln Gln Phe Gly Leu Gly Glu Asp
340 345 350

Ser Arg Val Glu Glu Ile Thr Phe Glu Ser Leu Lys Arg Cys Glu Tyr
355 360 365

Leu Lys Ala Phe Leu Asn Glu Thr Leu Arg Val Tyr Pro Ser Val Pro
370 375 380

Arg Asn Phe Arg Ile Ala Thr Lys Asn Thr Thr Leu Pro Arg Gly Gly
385 390 395 400

Gly Pro Asp Gly Thr Gln Pro Ile Leu Ile Gln Lys Gly Glu Gly Val
405 410 415

Ser Tyr Gly Ile Asn Ser Thr His Leu Asp Pro Val Tyr Tyr Gly Pro
420 425 430

Asp Ala Ala Glu Phe Arg Pro Glu Arg Trp Phe Glu Pro Ser Thr Arg
435 440 445

Lys Leu Gly Trp Ala Tyr Leu Pro Phe Asn Gly Gly Pro Arg Ile Cys
450 455 460

Leu Gly Gln Gln Phe Ala Leu Thr Glu Ala Gly Tyr Val Leu Val Arg
465 470 475 480

Leu Val Gln Glu Phe Ser His Ile Arg Ser Asp Pro Asp Glu Val Tyr
485 490 495

Pro Pro Lys Arg Leu Thr Asn Leu Thr Met Cys Leu Gln Asp Gly Ala
500 505 510

Ile Val Lys Phe Asp
515



<210> 114 
<211> 512 
<212> PRT 

<213> Candida tropicalis
<400> 114

Met Leu Asp Gln Ile Leu His Tyr Trp Tyr Ile Val Leu Pro Leu Leu
1 5 10 15

Ala Ile Ile Asn Gln Ile Val Ala His Val Arg Thr Asn Tyr Leu Met
20 25 30

Lys Lys Leu Gly Ala Lys Pro Phe Thr His Val Gln Arg Asp Gly Trp
35 40 45

Leu Gly Phe Lys Phe Gly Arg Glu Phe Leu Lys Ala Lys Ser Ala Gly
50 55 60

Arg Ser Val Asp Leu Ile Ile Ser Arg Phe His Asp Asn Glu Asp Thr
65 70 75 80

Phe Ser Ser Tyr Ala Phe Gly Asn His Val Val Phe Thr Arg Asp Pro
85 90 95

Glu Asn Ile Lys Ala Leu Leu Ala Thr Gln Phe Gly Asp Phe Ser Leu
100 105 110

Gly Ser Arg Val Lys Phe Phe Lys Pro Leu Leu Gly Tyr Gly Ile Phe
115 120 125

Thr Leu Asp Ala Glu Gly Trp Lys His Ser Arg Ala Met Leu Arg Pro
130 135 140

Gln Phe Ala Arg Glu Gln Val Ala His Val Thr Ser Leu Glu Pro His
145 150 155 160

Phe Gln Leu Leu Lys Lys His Ile Leu Lys His Lys Gly Glu Tyr Phe
165 170 175

Asp Ile Gln Glu Leu Phe Phe Arg Phe Thr Val Asp Ser Ala Thr Glu
180 185 190

Phe Leu Phe Gly Glu Ser Val His Ser Leu Lys Asp Glu Glu Ile Gly
195 200 205

Tyr Asp Thr Lys Asp Met Ser Glu Glu Arg Arg Arg Phe Ala Asp Ala
210 215 220

Phe Asn Lys Ser Gln Val Tyr Val Ala Thr Arg Val Ala Leu Gln Asn
225 230 235 240

Leu Tyr Trp Leu Val Asn Asn Lys Glu Phe Lys Glu Cys Asn Asp Ile
245 250 255

Val His Lys Phe Thr Asn Tyr Tyr Val Gln Lys Ala Leu Asp Ala Thr
260 265 270

Pro Glu Glu Leu Glu Lys Gln Gly Gly Tyr Val Phe Leu Tyr Glu Leu
275 280 285

Val Lys Gln Thr Arg Asp Pro Lys Val Leu Arg Asp Gln Ser Leu Asn
290 295 300

Ile Leu Leu Ala Gly Arg Asp Thr Thr Ala Gly Leu Leu Ser Phe Ala
305 310 315 320

Val Phe Glu Leu Ala Arg Asn Pro His Ile Trp Ala Lys Leu Arg Glu
325 330 335

Glu Ile Glu Gln Gln Phe Gly Leu Gly Glu Asp Ser Arg Val Glu Glu
340 345 350

Ile Thr Phe Glu Ser Leu Lys Arg Cys Glu Tyr Leu Lys Ala Val Leu
355 360 365

Asn Glu Thr Leu Arg Leu His Pro Ser Val Pro Arg Asn Ala Arg Phe
370 375 380

Ala Ile Lys Asp Thr Thr Leu Pro Arg Gly Gly Gly Pro Asn Gly Lys
385 390 395 400

Asp Pro Ile Leu Ile Arg Lys Asp Glu Val Val Gln Tyr Ser Ile Ser
405 410 415

Ala Thr Gln Thr Asn Pro Ala Tyr Tyr Gly Ala Asp Ala Ala Asp Phe
420 425 430

Arg Pro Glu Arg Trp Phe Glu Pro Ser Thr Arg Asn Leu Gly Trp Ala
435 440 445

Phe Leu Pro Phe Asn Gly Gly Pro Arg Ile Cys Leu Gly Gln Gln Phe
450 455 460

Ala Leu Thr Glu Ala Gly Tyr Val Leu Val Arg Leu Val Gln Glu Phe
465 470 475 480

Pro Asn Leu Ser Gln Asp Pro Glu Thr Lys Tyr Pro Pro Pro Arg Leu
485 490 495

Ala His Leu Thr Met Cys Leu Phe Asp Gly Ala His Val Lys Met Ser
500 505 510

<210> 115 
<211> 512 
<212> PRT 

<213> Candida tropicalis
<400> 115

Met Leu Asp Gln Ile Phe His Tyr Trp Tyr Ile Val Leu Pro Leu Leu
1 5 10 15

Val Ile Ile Lys Gln Ile Val Ala His Ala Arg Thr Asn Tyr Leu Met
20 25 30

Lys Lys Leu Gly Ala Lys Pro Phe Thr His Val Gln Leu Asp Gly Trp
35 40 45

Phe Gly Phe Lys Phe Gly Arg Glu Phe Leu Lys Ala Lys Ser Ala Gly
50 55 60

Arg Gln Val Asp Leu Ile Ile Ser Arg Phe His Asp Asn Glu Asp Thr
65 70 75 80

Phe Ser Ser Tyr Ala Phe Gly Asn His Val Val Phe Thr Arg Asp Pro
85 90 95

Glu Asn Ile Lys Ala Leu Leu Ala Thr Gln Phe Gly Asp Phe Ser Leu
100 105 110

Gly Ser Arg Val Lys Phe Phe Lys Pro Leu Leu Gly Tyr Gly Ile Phe
115 120 125

Thr Leu Asp Gly Glu Gly Trp Lys His Ser Arg Ala Met Leu Arg Pro
130 135 140

Gln Phe Ala Arg Glu Gln Val Ala His Val Thr Ser Leu Glu Pro His
145 150 155 160

Phe Gln Leu Leu Lys Lys His Ile Leu Lys His Lys Gly Glu Tyr Phe
165 170 175

Asp Ile Gln Glu Leu Phe Phe Arg Phe Thr Val Asp Ser Ala Thr Glu
180 185 190

Phe Leu Phe Gly Glu Ser Val His Ser Leu Arg Asp Glu Glu Ile Gly
195 200 205

Tyr Asp Thr Lys Asp Met Ala Glu Glu Arg Arg Lys Phe Ala Asp Ala
210 215 220

Phe Asn Lys Ser Gln Val Tyr Leu Ser Thr Arg Val Ala Leu Gln Thr
225 230 235 240

Leu Tyr Trp Leu Val Asn Asn Lys Glu Phe Lys Glu Cys Asn Asp Ile
245 250 255

Val His Lys Phe Thr Asn Tyr Tyr Val Gln Lys Ala Leu Asp Ala Thr
260 265 270

Pro Glu Glu Leu Glu Lys Gln Gly Gly Tyr Val Phe Leu Tyr Glu Leu
275 280 285

Ala Lys Gln Thr Lys Asp Pro Asn Val Leu Arg Asp Gln Ser Leu Asn
290 295 300

Ile Leu Leu Ala Gly Arg Asp Thr Thr Ala Gly Leu Leu Ser Phe Ala
305 310 315 320

Val Phe Glu Leu Ala Arg Asn Pro His Ile Trp Ala Lys Leu Arg Glu
325 330 335

Glu Ile Glu Ser His Phe Gly Ser Gly Glu Asp Ser Arg Val Glu Glu
340 345 350

Ile Thr Phe Glu Ser Leu Lys Arg Cys Glu Tyr Leu Lys Ala Val Leu
355 360 365

Asn Glu Thr Leu Arg Leu His Pro Ser Val Pro Arg Asn Ala Arg Phe
370 375 380

Ala Ile Lys Asp Thr Thr Leu Pro Arg Gly Gly Gly Pro Asn Gly Lys
385 390 395 400

Asp Pro Ile Leu Ile Arg Lys Asn Glu Val Val Gln Tyr Ser Ile Ser
405 410 415

Ala Thr Gln Thr Asn Pro Ala Tyr Tyr Gly Ala Asp Ala Ala Asp Phe
420 425 430

Arg Pro Glu Arg Trp Phe Glu Pro Ser Thr Arg Asn Leu Gly Trp Ala
435 440 445

Tyr Leu Pro Phe Asn Gly Gly Pro Arg Ile Cys Leu Gly Gln Gln Phe
450 455 460

Ala Leu Thr Glu Ala Gly Tyr Val Leu Val Arg Leu Val Gln Glu Phe
465 470 475 480

Pro Ser Leu Ser Gln Asp Pro Glu Thr Glu Tyr Pro Pro Pro Arg Leu
485 490 495

Ala His Leu Thr Met Cys Leu Phe Asp Gly Ala Tyr Val Lys Met Gln
500 505 510

<210> 116 
<211> 499 
<212> PRT 

<213> CANDIDATROPICALIS 
<400> 116 

Met Ala He Ser Ser Leu Leu Ser Trp Asp Val lie Cys Val Val Phe 
15 10 15 



He Cys Val Cys Val Tyr Phe Gly Tyr Glu Tyr Cys Tyr Thr Lys Tyr 
20 25 30 



Leu Met His Lys His Gly Ala Arg Glu He Glu Asn Val He Asn Asp 
35 40 45 



Gly Phe Phe Gly Phe Arg Leu Pro Leu Leu Leu Met Arg Ala Ser Asn 
50 55 60 



Glu Gly Arg Leu Ile Glu Phe Ser Val Lys Arg Phe Glu Ser Ala Pro
65 70 75 80



His Pro Gln Asn Lys Thr Leu Val Asn Arg Ala Leu Ser Val Pro Val
85 90 95 



Ile Leu Thr Lys Asp Pro Val Asn Ile Lys Ala Met Leu Ser Thr Gln
100 105 110 



Phe Asp Asp Phe Ser Leu Gly Leu Arg Leu His Gln Phe Ala Pro Leu
115 120 125 



Leu Gly Lys Gly Ile Phe Thr Leu Asp Gly Pro Glu Trp Lys Gln Ser
130 135 140 



Arg Ser Met Leu Arg Pro Gln Phe Ala Lys Asp Arg Val Ser His Ile
145 150 155 160 



Ser Asp Leu Glu Pro His Phe Val Leu Leu Arg Lys His Ile Asp Gly
165 170 175 



His Asn Gly Asp Tyr Phe Asp Ile Gln Glu Leu Tyr Phe Arg Phe Ser
180 185 190 



Met Asp Val Ala Thr Gly Phe Leu Phe Gly Glu Ser Val Gly Ser Leu 
195 200 205 



Lys Asp Glu Asp Ala Arg Phe Ser Glu Ala Phe Asn Glu Ser Gln Lys
210 215 220



Tyr Leu Ala Thr Arg Ala Thr Leu His Glu Leu Tyr Phe Leu Cys Asp 
225 230 235 240 



Gly Phe Arg Phe Arg Gln Tyr Asn Lys Val Val Arg Lys Phe Cys Ser
245 250 255 



Gln Cys Val His Lys Ala Leu Asp Val Ala Pro Glu Asp Thr Ser Glu
260 265 270 



Tyr Val Phe Leu Arg Glu Leu Val Lys His Thr Arg Asp Pro Val Val 
275 280 285 
Leu Gln Asp Gln Ala Leu Asn Val Leu Leu Ala Gly Arg Asp Thr Thr
290 295 300 



Ala Ser Leu Leu Ser Phe Ala Thr Phe Glu Leu Ala Arg Asn Asp His 
305 310 315 320



Met Trp Arg Lys Leu Arg Glu Glu Val Ile Ser Thr Met Gly Pro Ser
325 330 335 



Ser Asp Glu Ile Thr Val Ala Gly Leu Lys Ser Cys Arg Tyr Leu Lys
340 345 350 



Ala Ile Leu Asn Glu Thr Leu Arg Leu Tyr Pro Ser Val Pro Arg Asn
355 360 365 



Ala Arg Phe Ala Thr Arg Asn Thr Thr Leu Pro Arg Gly Gly Gly Pro 
370 375 380 



Asp Gly Ser Phe Pro Ile Leu Ile Arg Lys Gly Gln Pro Val Gly Tyr
385 390 395 400 



Phe Ile Cys Ala Thr His Leu Asn Glu Lys Val Tyr Gly Asn Asp Ser
405 410 415 



His Val Phe Arg Pro Glu Arg Trp Ala Ala Leu Glu Gly Lys Ser Leu 
420 425 430 



Gly Trp Ser Tyr Leu Pro Phe Asn Gly Gly Pro Arg Ser Cys Leu Gly 
435 440 445



Gln Gln Phe Ala Ile Leu Glu Ala Ser Tyr Val Leu Ala Arg Leu Thr
450 455 460 



Gln Cys Tyr Thr Thr Ile Gln Leu Arg Thr Thr Glu Tyr Pro Pro Lys
465 470 475 480 



Lys Leu Val His Leu Thr Met Ser Leu Leu Asn Gly Val Tyr Ile Arg
485 490 495 



Thr Arg Thr 



<210> 117 

<211> 679 

<212> PRT 

<213> Candida tropicalis
<400> 117



Met Ala Leu Asp Lys Leu Asp Leu Tyr Val Ile Ile Thr Leu Val Val
1 5 10 15



Ala Val Ala Ala Tyr Phe Ala Lys Asn Gln Phe Leu Asp Gln Pro Gln
20 25 30 



Asp Thr Gly Phe Leu Asn Thr Asp Ser Gly Ser Asn Ser Arg Asp Val 
35 40 45 



Leu Ser Thr Leu Lys Lys Asn Asn Lys Asn Thr Leu Leu Leu Phe Gly 
50 55 60 



Ser Gln Thr Gly Thr Ala Glu Asp Tyr Ala Asn Lys Leu Ser Arg Glu
65 70 75 80 



Leu His Ser Arg Phe Gly Leu Lys Thr Met Val Ala Asp Phe Ala Asp 
85 90 95



Tyr Asp Trp Asp Asn Phe Gly Asp Ile Thr Glu Asp Ile Leu Val Phe
100 105 110 



Phe Ile Val Ala Thr Tyr Gly Glu Gly Glu Pro Thr Asp Asn Ala Asp
115 120 125 



Glu Phe His Thr Trp Leu Thr Glu Glu Ala Asp Thr Leu Ser Thr Leu 
130 135 140 



Lys Tyr Thr Val Phe Gly Leu Gly Asn Ser Thr Tyr Glu Phe Phe Asn 
145 150 155 160 



Ala Ile Gly Arg Lys Phe Asp Arg Leu Leu Ser Glu Lys Gly Gly Asp
165 170 175 



Arg Phe Ala Glu Tyr Ala Glu Gly Asp Asp Gly Thr Gly Thr Leu Asp 
180 185 190 



Glu Asp Phe Met Ala Trp Lys Asp Asn Val Phe Asp Ala Leu Lys Asn 
195 200 205 



Asp Leu Asn Phe Glu Glu Lys Glu Leu Lys Tyr Glu Pro Asn Val Lys 
210 215 220 



Leu Thr Glu Arg Asp Asp Leu Ser Ala Ala Asp Ser Gln Val Ser Leu
225 230 235 240 
Gly Glu Pro Asn Lys Lys Tyr Ile Asn Ser Glu Gly Ile Asp Leu Thr
245 250 255 



Lys Gly Pro Phe Asp His Thr His Pro Tyr Leu Ala Arg Ile Thr Glu
260 265 270 



Thr Arg Glu Leu Phe Ser Ser Lys Asp Arg His Cys Ile His Val Glu
275 280 285 



Phe Asp Ile Ser Glu Ser Asn Leu Lys Tyr Thr Thr Gly Asp His Leu
290 295 300 



Ala Ile Trp Pro Ser Asn Ser Asp Glu Asn Ile Lys Gln Phe Ala Lys
305 310 315 320 



Cys Phe Gly Leu Glu Asp Lys Leu Asp Thr Val Ile Glu Leu Lys Ala
325 330 335 



Leu Asp Ser Thr Tyr Thr Ile Pro Phe Pro Thr Pro Ile Thr Tyr Gly
340 345 350 



Ala Val Ile Arg His His Leu Glu Ile Ser Gly Pro Val Ser Arg Gln
355 360 365 



Phe Phe Leu Ser Ile Ala Gly Phe Ala Pro Asp Glu Glu Thr Lys Lys
370 375 380 



Ala Phe Thr Arg Leu Gly Gly Asp Lys Gln Glu Phe Ala Ala Lys Val
385 390 395 400 



Thr Arg Arg Lys Phe Asn Ile Ala Asp Ala Leu Leu Tyr Ser Ser Asn
405 410 415 



Asn Ala Pro Trp Ser Asp Val Pro Phe Glu Phe Leu Ile Glu Asn Val
420 425 430 



Pro His Leu Thr Pro Arg Tyr Tyr Ser Ile Ser Ser Ser Ser Leu Ser
435 440 445 



Glu Lys Gln Leu Ile Asn Val Thr Ala Val Val Glu Ala Glu Glu Glu
450 455 460 



Ala Asp Gly Arg Pro Val Thr Gly Val Val Thr Asn Leu Leu Lys Asn 
465 470 475 480 
Val Glu Ile Val Gln Asn Lys Thr Gly Glu Lys Pro Leu Val His Tyr
485 490 495 



Asp Leu Ser Gly Pro Arg Gly Lys Phe Asn Lys Phe Lys Leu Pro Val 
500 505 510 



His Val Arg Arg Ser Asn Phe Lys Leu Pro Lys Asn Ser Thr Thr Pro 
515 520 525



Val Ile Leu Ile Gly Pro Gly Thr Gly Val Ala Pro Leu Arg Gly Phe
530 535 540 



Val Arg Glu Arg Val Gln Gln Val Lys Asn Gly Val Asn Val Gly Lys
545 550 555 560 



Thr Leu Leu Phe Tyr Gly Cys Arg Asn Ser Asn Glu Asp Phe Leu Tyr 
565 570 575 



Lys Gln Glu Trp Ala Glu Tyr Ala Ser Val Leu Gly Glu Asn Phe Glu
580 585 590 



Met Phe Asn Ala Phe Ser Arg Gln Asp Pro Ser Lys Lys Val Tyr Val
595 600 605 



Gln Asp Lys Ile Leu Glu Asn Ser Gln Leu Val His Glu Leu Leu Thr
610 615 620 



Glu Gly Ala Ile Ile Tyr Val Cys Gly Asp Ala Ser Arg Met Ala Arg
625 630 635 640 



Asp Val Gln Thr Thr Ile Ser Lys Ile Val Ala Lys Ser Arg Glu Ile
645 650 655 



Ser Glu Asp Lys Ala Ala Glu Leu Val Lys Ser Trp Lys Val Gln Asn
660 665 670 



Arg Tyr Gln Glu Asp Val Trp
675 



<210> 118 

<211> 679 

<212> PRT 

<213> Candida tropicalis

<400> 118 
Met Ala Leu Asp Lys Leu Asp Leu Tyr Val Ile Ile Thr Leu Val Val
1 5 10 15



Ala Val Ala Ala Tyr Phe Ala Lys Asn Gln Phe Leu Asp Gln Pro Gln
20 25 30 



Asp Thr Gly Phe Leu Asn Thr Asp Ser Gly Ser Asn Ser Arg Asp Val 
35 40 45 



Leu Ser Thr Leu Lys Lys Asn Asn Lys Asn Thr Leu Leu Leu Phe Gly 
50 55 60



Ser Gln Thr Gly Thr Ala Glu Asp Tyr Ala Asn Lys Leu Ser Arg Glu
65 70 75 80 



Leu His Ser Arg Phe Gly Leu Lys Thr Met Val Ala Asp Phe Ala Asp 
85 90 95 



Tyr Asp Trp Asp Asn Phe Gly Asp Ile Thr Glu Asp Ile Leu Val Phe
100 105 110 



Phe Ile Val Ala Thr Tyr Gly Glu Gly Glu Pro Thr Asp Asn Ala Asp
115 120 125 



Glu Phe His Thr Trp Leu Thr Glu Glu Ala Asp Thr Leu Ser Thr Leu 
130 135 140 



Arg Tyr Thr Val Phe Gly Leu Gly Asn Ser Thr Tyr Glu Phe Phe Asn 
145 150 155 160 



Ala Ile Gly Arg Lys Phe Asp Arg Leu Leu Ser Glu Lys Gly Gly Asp
165 170 175 



Arg Phe Ala Glu Tyr Ala Glu Gly Asp Asp Gly Thr Gly Thr Leu Asp 
180 185 190 



Glu Asp Phe Met Ala Trp Lys Asp Asn Val Phe Asp Ala Leu Lys Asn 
195 200 205 



Asp Leu Asn Phe Glu Glu Lys Glu Leu Lys Tyr Glu Pro Asn Val Lys 
210 215 220 



Leu Thr Glu Arg Asp Asp Leu Ser Ala Ala Asp Ser Gln Val Ser Leu
225 230 235 240 
Gly Glu Pro Asn Lys Lys Tyr Ile Asn Ser Glu Gly Ile Asp Leu Thr
245 250 255 



Lys Gly Pro Phe Asp His Thr His Pro Tyr Leu Ala Arg Ile Thr Glu
260 265 270 



Thr Arg Glu Leu Phe Ser Ser Lys Glu Arg His Cys Ile His Val Glu
275 280 285 



Phe Asp Ile Ser Glu Ser Asn Leu Lys Tyr Thr Thr Gly Asp His Leu
290 295 300 



Ala Ile Trp Pro Ser Asn Ser Asp Glu Asn Ile Lys Gln Phe Ala Lys
305 310 315 320 



Cys Phe Gly Leu Glu Asp Lys Leu Asp Thr Val Ile Glu Leu Lys Ala
325 330 335 



Leu Asp Ser Thr Tyr Thr Ile Pro Phe Pro Thr Pro Ile Thr Tyr Gly
340 345 350 



Ala Val Ile Arg His His Leu Glu Ile Ser Gly Pro Val Ser Arg Gln
355 360 365



Phe Phe Leu Ser Ile Ala Gly Phe Ala Pro Asp Glu Glu Thr Lys Lys
370 375 380 



Thr Phe Thr Arg Leu Gly Gly Asp Lys Gln Glu Phe Ala Thr Lys Val
385 390 395 400 



Thr Arg Arg Lys Phe Asn Ile Ala Asp Ala Leu Leu Tyr Ser Ser Asn
405 410 415 



Asn Thr Pro Trp Ser Asp Val Pro Phe Glu Phe Leu Ile Glu Asn Ile
420 425 430 



Gln His Leu Thr Pro Arg Tyr Tyr Ser Ile Ser Ser Ser Ser Leu Ser
435 440 445



Glu Lys Gln Leu Ile Asn Val Thr Ala Val Val Glu Ala Glu Glu Glu
450 455 460 



Ala Asp Gly Arg Pro Val Thr Gly Val Val Thr Asn Leu Leu Lys Asn 
465 470 475 480 
Ile Glu Ile Ala Gln Asn Lys Thr Gly Glu Lys Pro Leu Val His Tyr
485 490 495 



Asp Leu Ser Gly Pro Arg Gly Lys Phe Asn Lys Phe Lys Leu Pro Val 
500 505 510 



His Val Arg Arg Ser Asn Phe Lys Leu Pro Lys Asn Ser Thr Thr Pro 
515 520 525 



Val Ile Leu Ile Gly Pro Gly Thr Gly Val Ala Pro Leu Arg Gly Phe
530 535 540 



Val Arg Glu Arg Val Gln Gln Val Lys Asn Gly Val Asn Val Gly Lys
545 550 555 560 



Thr Leu Leu Phe Tyr Gly Cys Arg Asn Ser Asn Glu Asp Phe Leu Tyr 
565 570 575 



Lys Gln Glu Trp Ala Glu Tyr Ala Ser Val Leu Gly Glu Asn Phe Glu
580 585 590 



Met Phe Asn Ala Phe Ser Arg Gln Asp Pro Ser Lys Lys Val Tyr Val
595 600 605 



Gln Asp Lys Ile Leu Glu Asn Ser Gln Leu Val His Glu Leu Leu Thr
610 615 620 



Glu Gly Ala Ile Ile Tyr Val Cys Gly Asp Ala Ser Arg Met Ala Arg
625 630 635 640 



Asp Val Gln Thr Thr Ile Ser Lys Ile Val Ala Lys Ser Arg Glu Ile
645 650 655 



Ser Glu Asp Lys Ala Ala Glu Leu Val Lys Ser Trp Lys Val Gln Asn
660 665 670 



Arg Tyr Gln Glu Asp Val Trp
675 